EMF Webinar 2 Homework

Reference: EMF User’s Guide

Viewing Raw Data

In this section, you’ll try out some of the options for viewing raw data in the EMF. In the Dataset Manager, find the dataset named ARINV_2007_AREA_Jan2012.txt. This ORL nonpoint inventory is part of the 2007 v3 EI project and contains annual emissions for stationary area sources.

Select the dataset in the Dataset Manager and click the View button to open the Dataset Properties View window. Switch to the Data tab and click the View button near the top of the tab. The Data Viewer window will open.

Look for the navigation widget in the top right corner of the Data Viewer window. You should see a line like:

Current: 1 - 100 Filtered: 144721 of 144721

The data window is currently showing records 1 through 100 of the dataset. There are 144,721 total records in the dataset. When initially viewing the data, no filter is applied so all the records are displayed.

Use the Row Filter to display only those rows where the annual emissions value is greater than zero. You will need to set up your row filter using the column ann_emis. Once you have entered your row filter, click the Apply button to apply your filter.
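If it helps to think of the row filter as the WHERE clause of a SQL query, the following Python sketch shows the idea on a tiny in-memory table. The table name inventory, the helper apply_row_filter, and the sample rows are made up for illustration; the EMF may evaluate filters differently behind the scenes.

```python
import sqlite3

# Hypothetical miniature inventory; the real dataset has 144,721 records.
rows = [
    ("09001", "2102004000", "NOX", 12.5),
    ("09001", "2102004000", "VOC", 0.0),
    ("09003", "2103006000", "CO", 3.2),
    ("09003", "2103006000", "SO2", 0.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (fips TEXT, scc TEXT, poll TEXT, ann_emis REAL)")
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?, ?)", rows)


def apply_row_filter(row_filter):
    """Return the rows that satisfy a filter expression, as a WHERE clause would."""
    query = f"SELECT fips, scc, poll, ann_emis FROM inventory WHERE {row_filter}"
    return conn.execute(query).fetchall()


filtered = apply_row_filter("ann_emis > 0")
# Mimics the navigation widget's "Filtered: n of total" display.
print(f"Filtered: {len(filtered)} of {len(rows)}")
```

After you apply a filter in the Data Viewer, the Filtered: count in the navigation widget should change in the same way.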

Questions

By default, the dataset is sorted by FIPS code, then by SCC code. Suppose you want to group the records by pollutant first. In the Sort Order textbox, change the sort order from FIPS,SCC to poll,fips,scc (the column names are not case sensitive). Click the Apply button to apply the new sort ordering.
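To see what a multi-column sort order does, here is a small Python sketch. The records and the sort_records helper are hypothetical; it only illustrates that the first column listed takes priority and that column names are treated case-insensitively.

```python
# Hypothetical records keyed by lowercase column name.
records = [
    {"fips": "09003", "scc": "2103006000", "poll": "CO"},
    {"fips": "09001", "scc": "2102004000", "poll": "VOC"},
    {"fips": "09001", "scc": "2102004000", "poll": "CO"},
    {"fips": "09001", "scc": "2101004000", "poll": "CO"},
]


def sort_records(records, sort_order):
    """Sort by a comma-separated column list such as 'poll,fips,scc'."""
    # Lowercase the names so 'POLL,FIPS,SCC' and 'poll,fips,scc' behave the same.
    columns = [name.strip().lower() for name in sort_order.split(",")]
    return sorted(records, key=lambda rec: tuple(rec[col] for col in columns))


by_pollutant = sort_records(records, "poll,fips,scc")
for rec in by_pollutant:
    print(rec["poll"], rec["fips"], rec["scc"])
```

All of the CO records come first, ordered by county and then SCC, followed by the VOC record.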

Question

Next, let’s use the row filter to restrict the inventory to a single county, such as Fairfield County, CT (FIPS code 09001).

Question

Sort the data for county 09001 by annual emissions value. We want the records with the largest emissions to show up first, so add the keyword desc after the column name to sort the records in descending order.
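The effect of the desc keyword can be sketched in Python as follows. The records and the parse_sort_term helper are hypothetical and only illustrate how a term such as ann_emis desc reverses the ordering for that column.

```python
# Hypothetical records for county 09001 (values are made up).
records = [
    {"scc": "2102004000", "poll": "NOX", "ann_emis": 12.5},
    {"scc": "2103006000", "poll": "CO", "ann_emis": 310.0},
    {"scc": "2104008000", "poll": "VOC", "ann_emis": 48.75},
]


def parse_sort_term(term):
    """Split a term like 'ann_emis desc' into (column, descending?)."""
    parts = term.strip().lower().split()
    descending = len(parts) > 1 and parts[1] == "desc"
    return parts[0], descending


column, descending = parse_sort_term("ann_emis desc")
largest_first = sorted(records, key=lambda rec: rec[column], reverse=descending)
print([rec["ann_emis"] for rec in largest_first])
```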

Questions

Use the navigation widget to page through the 4 pages of filtered data. The single left and right arrows move backwards and forwards one page at a time. The double arrows will take you to the first and last pages. Each time you change pages, the updated data has to be fetched from the server so it can take a little while for it to show up. Keep an eye on the Current: line to see if the new data has loaded.

You can use the slider to jump to any record in the data. Click and drag the slider handle. You’ll see the record number in the textbox change as you move the handle. When you let go of the handle, the page of data containing the selected record will be loaded (if you’re not already on the appropriate page). You can also type a record number directly into the textbox and then press the Enter key to load the appropriate page.
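The relationship between a record number and the page that contains it is simple arithmetic. The Python sketch below assumes a fixed page size of 100 records, based on the Current: 1 - 100 display; the function names are made up for illustration.

```python
import math

PAGE_SIZE = 100  # assumed from the "Current: 1 - 100" display


def page_for_record(record_number, page_size=PAGE_SIZE):
    """Return the 1-based page number that contains a 1-based record number."""
    return (record_number - 1) // page_size + 1


def page_range(page, total_records, page_size=PAGE_SIZE):
    """Return the (first, last) record numbers shown on a page."""
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total_records)
    return first, last


def page_count(total_records, page_size=PAGE_SIZE):
    """Return how many pages are needed to show all records."""
    return math.ceil(total_records / page_size)


total = 144721  # unfiltered record count from the navigation widget
page = page_for_record(250)
print(page, page_range(page, total), page_count(total))
```

For example, typing 250 into the textbox would load the page showing records 201 through 300.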

One final example to try. Set the row filter to fips = '09001' and avd_emis*365 > ann_emis. You may want to change the number of decimal places displayed from the default of 4 to 6 to show more digits in the columns. Type the number 6 into the Decimal Places textbox and click the Format button to apply the changes.
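This filter checks whether the average-day value, scaled to a full year, exceeds the reported annual value. The Python sketch below uses made-up records to show the comparison and why extra decimal places can matter when you inspect small average-day values; the helper names are hypothetical.

```python
# Hypothetical records: avd_emis is average-day emissions, ann_emis is annual.
records = [
    {"fips": "09001", "poll": "NOX", "avd_emis": 0.034257, "ann_emis": 12.5},
    {"fips": "09001", "poll": "CO", "avd_emis": 0.008767, "ann_emis": 3.2},
    {"fips": "09003", "poll": "CO", "avd_emis": 0.100000, "ann_emis": 3.2},
]


def matches(rec):
    """Python equivalent of: fips = '09001' and avd_emis*365 > ann_emis."""
    return rec["fips"] == "09001" and rec["avd_emis"] * 365 > rec["ann_emis"]


def format_value(value, decimal_places=4):
    """Format a number with a fixed number of decimal places."""
    return f"{value:.{decimal_places}f}"


selected = [rec for rec in records if matches(rec)]
for rec in selected:
    # Compare the default 4 decimal places with 6.
    print(rec["poll"], format_value(rec["avd_emis"], 4), format_value(rec["avd_emis"], 6))
```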

Question

Using QA Step Templates

For this exercise, you’ll be adding QA steps to the dataset named PTINV_2007_VADGUnits_march2010.orl. This dataset has the dataset type “ORL Point Inventory” and is part of the 2007 v3 EI project.

Add two QA steps using the templates for ORL point inventories: “Summarize by SCC and Pollutant” and “Summarize by SCC and Pollutant with Descriptions”. Tip: In the Add from Template dialog, you can select multiple templates to add multiple QA steps at once. Refer to Section 4.2 Add QA Step From Template.

IMPORTANT! Please edit each QA step you create and add your initials to the QA step’s name. The EMF requires every QA step associated with a dataset to have a unique name. When adding a QA step from a template, the EMF will automatically use the name of the template (e.g. Summarize by SCC and Pollutant) as the new QA step’s name. If a QA step with that exact name already exists, the EMF won’t be able to create a new QA step.

Question

After editing each QA step’s name, run the QA steps that you added. You can run each QA step individually from the Edit QA Step window or you can select multiple QA steps in the QA tab of the Dataset Properties Editor window and click the Run button at the bottom of the window. Check the Status window for the status of each run. Refer to Section 4.4 Running QA Steps.

Question

After each QA step has finished running, open the Edit QA Step window for each step. Confirm that the QA step’s Run Status is now “Success” and that the Run Date is recent.

Question

View the results of each QA step by clicking the View Results button. The QA step results viewer window allows you to sort and filter the data records just like the Data Viewer window.

Question

Export the results of the QA steps you created to a local file. Be sure to check the box next to Download result file to local machine?. Refer to Section 4.5 Exporting QA Step Results.

Open the exported files in Excel on your local machine. After you have reviewed the exported results, set the QA Status for each QA step that you created to Complete (refer to Section 4.2 Add QA Step From Template).

Exercise

Use a row filter when exporting your QA step’s results to export only data for Electric Generation Internal Combustion Engines (i.e. SCCs beginning with 201) for which annual NOx emissions are greater than 70 tons/year.
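One way to think about this filter is sketched below in Python with an in-memory SQLite table. The table name qa_result, the column names scc, poll, and ann_emis, and the sample rows are all assumptions; check the column names in your own QA step results before writing the filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column names; the actual QA step output may use different ones.
conn.execute("CREATE TABLE qa_result (scc TEXT, poll TEXT, ann_emis REAL)")
conn.executemany(
    "INSERT INTO qa_result VALUES (?, ?, ?)",
    [
        ("20100102", "NOX", 85.0),   # SCC starts with 201, NOx above 70
        ("20100102", "CO", 120.0),   # wrong pollutant
        ("20100201", "NOX", 40.0),   # NOx not above 70
        ("10100601", "NOX", 300.0),  # SCC does not start with 201
    ],
)

# substr is 1-based: substr(scc, 1, 3) returns the first three characters.
row_filter = "substr(scc, 1, 3) = '201' AND poll = 'NOX' AND ann_emis > 70"
matching = conn.execute(
    f"SELECT scc, poll, ann_emis FROM qa_result WHERE {row_filter}"
).fetchall()
print(matching)
```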

Questions

Comparing Datasets

Twelve ORL nonroad inventories containing monthly average-day emissions for 2007 are loaded on the MARAMA EMF server. These inventories are assigned to the 2007 v3 EI project.

Using Section 4.8 Compare Datasets QA Program as a guide, create and run a custom QA step to compare the average-day emissions for quarter 1 (January, February, and March) to the corresponding quarter 2 emissions (April, May, and June). Your QA step should compare the average-day emissions by SCC and pollutant.

When you create your QA step, add it to the January inventory. You’ll need to give your QA step a unique name like “Q1 vs. Q2 - initials”. When setting the arguments for the Compare Datasets QA program, set the Base Field Suffix to “q1” and the Compare Field Suffix to “q2” to make it easier to interpret the results.

Questions

Exercise

Export the results of your comparison QA step. Use a row filter to only include records where the emissions changed (increased or decreased) by more than 100%.
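The percent-change test behind this filter can be sketched in Python as follows. The column names q1 and q2 and the sample rows are hypothetical; use the column names that actually appear in your comparison results. The sketch skips rows where the base value is zero because the percent change is undefined there.

```python
def percent_change(base, compare):
    """Percent change from base to compare; None when the base is zero."""
    if base == 0:
        return None
    return (compare - base) / base * 100.0


# Hypothetical comparison rows (values are made up).
rows = [
    {"scc": "2260001010", "poll": "CO", "q1": 2.0, "q2": 5.0},   # +150%
    {"scc": "2260001010", "poll": "NOX", "q1": 4.0, "q2": 6.0},  # +50%
    {"scc": "2265004010", "poll": "VOC", "q1": 0.0, "q2": 1.5},  # undefined
]


def changed_by_more_than(rows, threshold_pct):
    """Keep rows whose absolute percent change exceeds the threshold."""
    selected = []
    for row in rows:
        change = percent_change(row["q1"], row["q2"])
        if change is not None and abs(change) > threshold_pct:
            selected.append(row)
    return selected


big_changes = changed_by_more_than(rows, 100.0)
print([(r["scc"], r["poll"]) for r in big_changes])
```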

Question

Exercise

For this exercise, you’ll create a QA step to compare annual emissions from 2007 nonpoint inventory data with the corresponding data for 2011. Start by locating the 2007 ORL nonpoint inventory dataset named:

For this dataset, create a custom QA step that uses the Compare Datasets QA program. Make sure to give your QA step a unique name. When setting up the Compare Datasets arguments, include the following eight 2011 FF10 nonpoint inventories as the comparison datasets:

For your comparison report, you can choose what detail level you’d like by setting the Group By Expressions. At a minimum, you’ll aggregate the annual emissions by county and pollutant code. For a more detailed report, you could include partial SCCs as a grouping expression using substr(scc,1,3) to group by the first 3 digits of the SCC code.
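To illustrate what grouping by a partial SCC does, the Python sketch below sums annual emissions by county, pollutant, and the first 3 digits of the SCC. The records and the sum_by_group helper are made up for illustration and are not how the EMF itself is implemented.

```python
from collections import defaultdict

# Hypothetical nonpoint records (values are made up).
records = [
    {"fips": "09001", "scc": "2102004000", "poll": "NOX", "ann_emis": 10.0},
    {"fips": "09001", "scc": "2102006000", "poll": "NOX", "ann_emis": 2.5},
    {"fips": "09001", "scc": "2401001000", "poll": "VOC", "ann_emis": 7.0},
    {"fips": "09001", "scc": "2103004000", "poll": "NOX", "ann_emis": 1.0},
]


def sum_by_group(records, scc_digits=3):
    """Sum ann_emis by (fips, poll, leading SCC digits)."""
    totals = defaultdict(float)
    for rec in records:
        # substr(scc, 1, 3) is 1-based; the Python slice equivalent is scc[0:3].
        key = (rec["fips"], rec["poll"], rec["scc"][:scc_digits])
        totals[key] += rec["ann_emis"]
    return dict(totals)


totals = sum_by_group(records)
for key, value in sorted(totals.items()):
    print(key, value)
```

Here the three NOX records share the partial SCC 210, so they collapse into a single row in the report.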

The two inventory dataset types don’t use the same names for all of the data fields. This means that you’ll need to use Matching Expressions when setting up the Compare Datasets arguments. You can find the names of the data fields by viewing the raw data for the dataset. For this exercise, the fields of interest are listed below.

Column             ORL Nonpoint   FF10 Nonpoint
County             FIPS           REGION_CD
SCC code           SCC            SCC
Pollutant code     POLL           POLL
Annual emissions   ANN_EMIS       ANN_VALUE

The 2011 inventories contain data for more regions and pollutants than the 2007 inventory. You’ll use a Where Filter to limit the records that will be compared. The following Where Filter lists all the states and pollutants of interest from the 2007 inventory. We’re skipping PM emissions because of a change in how PM emissions were reported in the 2011 inventories. For your report, start by comparing emissions just in your state.

substr(fips,1,2) IN ('09', '10', '11', '23', '24', '25', '33', '34', '36', '42', '44', '50', '51') and poll IN ('CO', 'NH3', 'NOX', 'SO2', 'VOC')

There are several different ways you could set up the arguments needed by the Compare Datasets QA program. Suggestions for each argument are given below.

Group By Expressions:

fips
poll

Aggregate Expressions:

ann_emis

Matching Expressions:

fips=region_cd
ann_emis=ann_value

Where Filter:

substr(fips,1,2) = '09' and poll IN ('CO', 'NH3', 'NOX', 'SO2', 'VOC')

Base Field Suffix:

2007

Compare Field Suffix:

2011
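To see how these arguments fit together, the following Python sketch mimics the comparison on a few made-up records. It is only an illustration under stated assumptions: the function names, the sample data, and the layout of the report tuples are hypothetical, and groups found in only one dataset are simply given a total of zero, which may not match what the Compare Datasets QA program does.

```python
from collections import defaultdict

# Matching expressions: ORL (base) field name -> FF10 (compare) field name.
MATCHING = {"fips": "region_cd", "ann_emis": "ann_value"}
POLLUTANTS = {"CO", "NH3", "NOX", "SO2", "VOC"}


def aggregate(records, county_col, emis_col, state):
    """Sum emissions by (county, pollutant) for one state and the listed pollutants."""
    totals = defaultdict(float)
    for rec in records:
        # Same idea as the Where Filter: substr(fips,1,2) = '09' and poll IN (...)
        if rec[county_col][:2] == state and rec["poll"] in POLLUTANTS:
            totals[(rec[county_col], rec["poll"])] += rec[emis_col]
    return totals


def compare(base, other, state="09"):
    """Return (county, poll, total_2007, total_2011, difference) tuples."""
    base_totals = aggregate(base, "fips", "ann_emis", state)
    other_totals = aggregate(other, MATCHING["fips"], MATCHING["ann_emis"], state)
    report = []
    for county, poll in sorted(set(base_totals) | set(other_totals)):
        old = base_totals.get((county, poll), 0.0)
        new = other_totals.get((county, poll), 0.0)
        report.append((county, poll, old, new, new - old))
    return report


# Hypothetical sample data (values are made up).
orl_2007 = [
    {"fips": "09001", "poll": "CO", "ann_emis": 100.0},
    {"fips": "09001", "poll": "CO", "ann_emis": 50.0},
    {"fips": "09001", "poll": "PM10", "ann_emis": 5.0},  # skipped: PM
    {"fips": "10001", "poll": "CO", "ann_emis": 20.0},   # skipped: other state
]
ff10_2011 = [
    {"region_cd": "09001", "poll": "CO", "ann_value": 120.0},
    {"region_cd": "09003", "poll": "NOX", "ann_value": 8.0},
]

report = compare(orl_2007, ff10_2011)
for row in report:
    print(row)
```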