Data Sets - Maple Help

 Data Sets

Maple 2015 features a new infrastructure for accessing and working with data sets from various built-in and online data sources. The DataSets package contains functions for retrieving and working with data sets, as well as GUI elements for the same purposes. This package can access time series data from the data aggregator Quandl, as well as locally installed data about countries and cities.

To try the examples on this page, select View > Open Page as Worksheet, and then execute the commands.

 > $\mathrm{with}\left(\mathrm{DataSets}\right)$
 $\left[{\mathrm{Builtin}}{,}{\mathrm{GetData}}{,}{\mathrm{GetDescription}}{,}{\mathrm{GetHeaders}}{,}{\mathrm{GetIdentifier}}{,}{\mathrm{GetName}}{,}{\mathrm{InsertSearchBox}}{,}{\mathrm{Quandl}}{,}{\mathrm{Reference}}{,}{\mathrm{Search}}\right]$ (1)

The help search box can search through any available data source by keyword, returning a list of related data sets. For example, searching for "Gold" returns the results shown in Figure 1.1. You can see a description of each data set in a tooltip by hovering over its entry.

Figure 1.1: Search Box Results

Choosing the first item on this list inserts a data set reference object, which you can then interact with further using the right-click context menu. Note that the third row of the data reference shows the source for the data.

It is also possible to perform an advanced search for more relevant data sets. Clicking the "Data Sets" label in the help search box opens an advanced search box with more results:

 > $\mathrm{DataSets}:-\mathrm{InsertSearchBox}(\mathrm{search}={"Gold"})$

Data Set Search
 1. Gold Price (USD)
 2. Net Debt for many countries
 3. Great Basin Gold (GBG)
 4. Yamana Gold ( AUY ) - Cash
 5. Andes Gold Corp ( AGCZ ) - Cash
 6. Andes Gold Corp ( AGCZ ) - Depreciation
 7. Andes Gold Corp ( AGCZ ) - Dividends
 8. Andes Gold Corp ( AGCZ ) - Revenues
 9. Yamana Gold ( AUY ) - Depreciation
 10. Yamana Gold ( AUY ) - Dividends

 Source: https://www.quandl.com/BUNDESBANK/BBK01_WT5511

The advanced search can also be programmatically accessed using the InsertSearchBox command from the DataSets package. You can specify a variable name to which the chosen data set will be assigned - either as an argument to InsertSearchBox, or in the user interface. Following is a search for unemployment data for Australia. The data set selected is Unemployment Level In Australia, which corresponds to Quandl's data set UNDATA/GID_UNEMP_AUS.

 > $\mathrm{InsertSearchBox}\left(\mathrm{search}="Australia Unemployment"\right)$

Data Set Search
 1. Unemployment Men - Annual - Australia
 2. Unemployment Level In Australia
 3. Unemployment Rate - Annual - Australia
 4. Unemployment Rate - Quarterly - Australia
 5. Unemployment Rate - Quarterly - Australia
 6. Unemployment Rate - Quarterly - Australia
 7. Unemployment Men - Quarterly - Australia
 8. Unemployment Rates - Australia
 9. Unemployment - Quarterly - Australia
 10. Unemployment Men - Quarterly - Australia

 Source: https://www.quandl.com/UNDATA/GID_UNEMP_AUS

If you select a data set in the left-hand list, it will be assigned to the variable name given at the top. In a fresh session, that name should initially be ds. In this example, the default is ds0, since ds has already been used in a previous example. You can see that ds0 now evaluates to the selected data set (if you have performed the search).

 > $\mathrm{ds0}$
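To check from code which data set ended up in ds0, you can query its metadata with the accessors exported by the DataSets package. The following is a minimal sketch, assuming that a data set was selected in the search box above so that ds0 holds a data set reference. The guard with assigned and the printed message are illustrative additions, not part of the original worksheet.

```maple
with(DataSets):

# Assumption: ds0 was assigned by selecting an entry in the search box.
# If nothing was selected, ds0 is still an unassigned name.
if not assigned(ds0) then
    print("ds0 is unassigned; select a data set in the search box first");
else
    # Name and identifier of the referenced data set. For the example
    # above, the identifier should correspond to UNDATA/GID_UNEMP_AUS.
    print(GetName(ds0));
    print(GetIdentifier(ds0));
end if;
```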



Exploring Quandl Data Sets

Maple 2015 contains an interface to the time series data provided by Quandl. Quandl hosts data from hundreds of publishers, comprising over 12 million data sets, on their web site and through an API.

Note that this worksheet will pre-load some data from Quandl for an animation near the bottom. While the data is being loaded, you cannot run other Maple code.

Another way to look for data in Maple is to issue the Search command. If you provide a string, you get the top few data sets matching it. For example, following are some results related to peanuts, peanut butter, butter, and/or Canada. The maxhits option limits the search to the top 10 results.

$\mathrm{Search}\left("peanut butter canada",'\mathrm{maxhits}'=10\right)$

The first data set has exactly the information you were looking for: peanut butter information for Canada. This data set corresponds to the Quandl reference code: FAO/FAO_33CANADA247PEANUTBUTTER.
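To work with that first result, it helps to store the search results in a variable and pick out the entry. The sketch below assumes that the value returned by Search can be indexed like a list and that the first hit is the Canadian peanut butter data set; the names results and data are chosen here for illustration.

```maple
with(DataSets):

# Store the search results instead of only displaying them.
results := Search("peanut butter canada", 'maxhits' = 10):

# Assumption: the results can be indexed like a list, and the first hit
# is the Canadian peanut butter data set discussed above.
data := results[1]:

# Read the description attached to the data set reference.
GetDescription(data);
```

Ending a statement with a colon suppresses its output, so only the description is displayed.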

You can now interact with this data in a few ways, one of which is using commands. You can find out some metadata; for example, what the four columns represent:

$\mathrm{GetHeaders}\left(\mathrm{data}\right);$

 $\left[{"Import Quantity \left(tonnes\right)"}{,}{"Import Value \left(1000 US\$\right)"}{,}{"Export Quantity \left(tonnes\right)"}{,}{"Export Value \left(1000 US\$\right)"}\right]$ (1.1.1)

You can also convert the data into, say, a $\mathrm{Matrix}$:

Notice that the matrix for peanut butter information for Canada showed a larger number of rows than the dimension of M. This is because this data is only a data set reference: it contains metadata about how to retrieve the given data, but not necessarily the data itself. However, now that you have retrieved the data, it displays as follows:
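The distinction between a reference and its data can also be made explicit with the GetData command from the DataSets package, which requests the underlying data for a reference. This is a minimal sketch, assuming that data is obtained from Search as the first hit; the name ts is illustrative, and the type of the value returned by GetData is not stated on this page.

```maple
with(DataSets):

# Assumption: the first search hit is the Canadian peanut butter data set.
results := Search("peanut butter canada", 'maxhits' = 10):
data := results[1]:

# Request the underlying data explicitly. For a reference backed by an
# online source, this is the step that requires the data to be retrieved.
ts := GetData(data):

# The headers are metadata and describe the columns of the data.
GetHeaders(data);
```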

$\mathrm{data}$

You can also plot a data set reference, using, for example, the dataplot command.

$\mathrm{dataplot}\left(\mathrm{data}\right)$

It is also possible to operate on the data, to create new values, and to select certain values, by using indexing. For example, to select years where the export value was more than 15 million US$ (that is, 15000 units of 1000 US$ each), you can do this:

$\mathrm{dataplot}\left(%\right);$

Or, to create a new time series for the ratio between import and export quantities (in tonnes), you can do the following:

$\mathrm{dataplot}\left(%\right);$

You can see that no export or import was recorded before about 1990; since then, imports have grown relative to exports.

All these interactions are also available through the context menu. If you right-click on "data", shown below, you can select Plots. The context menu offers an easy interface to the powerful TimeSeriesPlot command: you can choose to show all columns of the data in a single plot, all columns in separate plots, or just one column of your choice.

$\mathrm{data}$$\to$

Filtering of data by date or other quantities is also available, as is selecting subsets of the columns. For example, if you are interested only in the export quantity and value, choose Select columns from the context menu, select the columns of interest, and click OK. Then right-click the resulting time series, select Assign to a name, and enter, say, exportdata as the name. You can then run a command such as the following to plot all the export data.

$\mathrm{data}$$\to$ $\stackrel{\text{assign to a name}}{\to }$${\mathrm{exportdata}}$



$\mathrm{dataplot}\left(\mathrm{exportdata}\right);$


Example: Exploring Economic Indicators for Several Countries

Finally, there is some supporting code in the start-up region of this worksheet (to see it, click the gear-and-cog icon in the worksheet toolbar). This code loads a number of data sets upon opening this worksheet, and it powers the Explore command, which shows several measures of economic output for 12 important economies (all in US Dollars), and time series for one selectable country. For the time series on the lower-right, if you click on a curve (not the data points on the curve!), the corresponding curve in the legend will be highlighted, and vice versa.

There is some explanation of these data sources following the result from Explore.

Explore controls: $\mathbf{year}$, $\mathbf{country}$

Data Sources

The following uses the descriptions for the US data, but the same descriptions apply to the equivalent data sets for other countries.

GDP (at PPP): United States Country GDP based on PPP Valuation, USD Billions. Units: Current international dollar. Multiplier: Billions. Estimates begin after 2013. This data is sourced from www.opendataforafrica.org where it is offered under an open data licence (www.opendataforafrica.org/legal/termsofuse). These data form the basis for the country weights used to generate the World Economic Outlook country group composites for the domestic economy. The IMF is not a primary source for purchasing power parity (PPP) data. WEO weights have been created from primary sources and are used solely for purposes of generating country group composites. For primary source information, please refer to one of the following sources: the Organization for Economic Cooperation and Development, the World Bank, or the Penn World Tables. For further information see Box A2 in the April 2004 World Economic Outlook, Box 1.2 in the September 2003 World Economic Outlook for a discussion on the measurement of global growth and Box A.1 in the May 2000 World Economic Outlook for a summary of the revised PPP-based weights, and Annex IV of the May 1993 World Economic Outlook. See also Anne Marie Gulde and Marianne Schulze-Ghattas, Purchasing Power Parity Based Weights for the World Economic Outlook, in Staff Studies for the World Economic Outlook (Washington: IMF, December 1993), pp. 106-23. See notes for: Gross domestic product, current prices (National currency).

Real GDP: GDP at purchaser's prices is the sum of gross value added by all resident producers in the economy plus any product taxes and minus any subsidies not included in the value of the products. It is calculated without making deductions for depreciation of fabricated assets or for depletion and degradation of natural resources. Data are in constant 2000 U.S. dollars. Dollar figures for GDP are converted from domestic currencies using 2000 official exchange rates. For a few countries where the official exchange rate does not reflect the rate effectively applied to actual foreign exchange transactions, an alternative conversion factor is used.
GDP (constant 2000 US$)

Net Nat'l Income: Adjusted net national income is GNI minus consumption of fixed capital and natural resources depletion.
Adjusted net national income (constant 2000 US$)

Gross Value Added: United States of America Gross Value Added by Kind of Economic Activity at current prices - US dollars. The National Accounts Main Aggregates database is produced and maintained by the Economic Statistics Branch of the United Nations Statistics Division (UNSD), with input from the UNSD, international organizations, and the national statistical agencies of more than 200 countries. The database consists of a complete and consistent record of the main National Accounts of each UN member state, from 1970 onwards. Records are based on official country data reported to the UNSD, supplemented with estimates for any years and countries with incomplete or inconsistent information. The database is updated in December of each year. For more about the National Accounts Main Aggregates database, see http://unstats.un.org/unsd/snaama/introduction.asp. To see country-specific metadata, see http://unstats.un.org/unsd/snaama/metasearch.asp?letter=U.