Analysis-Services/UsqlScripts
2017-08-02 14:16:29 -07:00
all_single Full set of U-SQL scripts with storage account name replaced with placeholder. 2017-08-02 14:06:14 -07:00
large_multiple Full set of U-SQL scripts with storage account name replaced with placeholder. 2017-08-02 14:06:14 -07:00
last_available_year Full set of U-SQL scripts with storage account name replaced with placeholder. 2017-08-02 14:06:14 -07:00
Modelling Full set of U-SQL scripts with storage account name replaced with placeholder. 2017-08-02 14:06:14 -07:00
readme.rtf RTF version with hyperlinks. 2017-08-02 14:16:29 -07:00
readme.txt Providing a text version of the readme. 2017-08-02 14:13:19 -07:00

U-SQL Scripts for Processing a TPC-DS Data Set
The U-SQL scripts for processing a TPC-DS data set demonstrate how to use Azure Data Lake Analytics to prepare raw data for import into an Azure Analysis Services data model. For a detailed discussion, see the blog article “Using Azure Analysis Services on Top of Azure Data Lake Storage” on the Analysis Services Team Blog.
To use these scripts, first generate the TPC-DS data set with the dsdgen tool, which can be downloaded as source code from the TPC-DS web site. Run the dsdgen tool with /PARALLEL 100 and /CHILD ids ranging from 1 to 100 to generate the source files with the expected file naming conventions, and place the source files in an Azure Blob Storage account, as discussed in “Building an Azure Analysis Services Model on Top of Azure Blob Storage—Part 2” on the Analysis Services Team Blog. Finally, edit the U-SQL scripts and replace the storage account placeholder (@<blob storage account name>) with your actual storage account.
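The two mechanical steps above (one dsdgen run per child id, then a placeholder replacement across all scripts) can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the original scripts: the executable name dsdgen, the .usql file extension, UTF-8 encoding, and the helper names dsdgen_commands and replace_placeholder are all hypothetical. Only the /PARALLEL and /CHILD options and the placeholder token come from this readme; any other dsdgen options you need (such as a scale factor or output directory) are not shown.

```python
from pathlib import Path

# Placeholder token as quoted in this readme. The "@" in front of it is left
# untouched, so only the angle-bracket token is replaced.
PLACEHOLDER = "<blob storage account name>"


def dsdgen_commands(executable="dsdgen", parallel=100):
    """Build one argument list per child id, from 1 to `parallel` inclusive.

    `executable` is an assumed name or path; adjust it to your dsdgen build.
    The commands are only constructed here, not executed.
    """
    return [
        [executable, "/PARALLEL", str(parallel), "/CHILD", str(child)]
        for child in range(1, parallel + 1)
    ]


def replace_placeholder(root, account, pattern="*.usql"):
    """Replace PLACEHOLDER with `account` in every matching file under `root`.

    `pattern` assumes the scripts use a .usql extension. Returns the list of
    files that were actually changed.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8")  # assumed encoding
        if PLACEHOLDER in text:
            path.write_text(text.replace(PLACEHOLDER, account), encoding="utf-8")
            changed.append(path)
    return changed
```

Each list returned by dsdgen_commands() could be passed to subprocess.run, and replace_placeholder("UsqlScripts", "<your account>") would update all four subfolders in one pass. Work on a copy of the scripts, because the files are rewritten in place.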
The subfolders containing the U-SQL scripts highlight different scenarios:
* all_single - These scripts create a single CSV file per table containing all the source data.
* large_multiple - These scripts create 4 CSV files for each of the large tables (catalog_returns, catalog_sales, inventory, store_returns, store_sales, web_returns, and web_sales) and a single CSV file for each of the remaining tables.
* last_available_year - These scripts create a single CSV file per table containing only the source data for the last year in the data set, which is the year 2003.
* Modelling - These scripts create a data set for modelling purposes with a single CSV file per table containing up to 100 rows of data.