SCIN 137 AMU week 6 lesson lab Hurricane Matthew Lab Introduction to Meteorology American Military university
Topics to be covered include:
- Labeling and curating data
- Role of spreadsheets in data management
- Role of graphs in visualization
- Common graphs and when to use them
In the last lesson, we established the foundation for statistical analyses by learning the basic principles of data collection and measurement. We defined the difference between qualitative and quantitative data, accuracy versus precision, and the importance of accurately reporting numbers based on significant digits. For this lesson, we will learn how to organize our data in spreadsheets and create graphs and figures to visualize patterns in the data. We will define three of the most commonly used graphs for data visualization, explain when to use them in statistical analyses, and practice creating graphs using sample data sets.
Data Storytelling: An Essential Skill for Job Hunters
In today’s increasingly technology-dependent world, more industries than ever are relying on data to inform their decisions. Businesses use data to determine which marketing practices are most successful at reaching customers and selling products. Website designers use programs like Google Analytics to track the number of visitors to their sites. Scientists use quantitative data to determine relationships and patterns between variables in their experiments. Pharmaceutical companies rely on clinical trials to measure the efficacy and safety of new drugs for functions such as treating diseases and preventing pregnancy.
This prevalence of data also creates a need for people who can interpret data and communicate their findings in plain language. On LinkedIn, a widely used professional networking site, data analysis is one of the hottest skills for job recruiters and is consistently ranked in the top four most in-demand skills across multiple countries (Dykes, 2016). Data analysis skills tend to favor those with degrees in economics, mathematics, or statistics, but understanding the numbers is only half the battle. To get the most benefit from data, analysts also need to be able to communicate their findings effectively, a skill sometimes called “data storytelling” (Dykes, 2016).
Data storytelling encompasses skills such as creating graphs and figures, understanding the computer programs used to analyze the data, and communicating those results through easy-to-understand oral and written presentations. As more companies and disciplines rely on data, demand will increase for people who can interpret and effectively communicate those findings, and for computer tools and programs that make data analysis more accessible and user-friendly. As you complete your college degree and begin job searching, consider the value of gaining additional skills in data analysis and public speaking to help you compete in today’s tech-savvy job market.
Data Analysis: From Measurement to Interpretation
Imagine that you are conducting background research for an experiment you are about to design or are reading papers to determine which ones to cite for a manuscript that you are preparing to submit. If you look at the reference section of any scientific manuscript, you could potentially find dozens of papers that the authors have cited to provide background information or support for their findings. Reading so many sources represents an enormous, time-consuming task, so scientists will often prioritize reading certain sections to obtain the most information in the shortest time.
Scientific manuscripts are typically broken down into four sections: Introduction, Methods, Results, and Conclusion, as well as an Abstract at the beginning, which we have practiced analyzing in previous lessons. When time is of the essence, researchers prioritize the Abstract to get an overview of the study and the Results section, which summarizes the data with visually appealing tables, graphs, and figures. Based on those two sections, they can quickly determine if the manuscript is relevant to their current pursuit.
As we continue this course, we now turn our attention to that all-important Results section, where we boil down our data into graphs and figures that tell a succinct, visual story. The Results section is often the shortest part of a scientific paper, but creating each graph requires extensive data management and organization, statistical analyses, and the ability to choose which figure is most appropriate to visualize the data. We will first discuss the importance of labeling and organizing data in spreadsheets to make statistical analyses easier. Then we will break down how to interpret figures and discuss three of the most common graphs that researchers use to display their data.
Use Vertical Columns, Borders, and Colors to Visually Organize Data
Consider a scenario where you are comparing four different freshwater lakes to determine the effects of nutrient levels on dissolved oxygen. Dissolved oxygen refers to the amount of free oxygen present in the water, and too much or too little oxygen can harm aquatic life and water quality (Fondriest Environmental Inc., 2016). Runoff from plant fertilizers can increase the amount of dissolved nutrients, such as nitrogen and phosphorus, in the water and promote the growth of aquatic plants and algae. As these plants and algae grow and die, bacterial decomposition increases and consumes dissolved oxygen (Fondriest Environmental Inc., 2016).
Imagine that you collect data from four lakes on five different dates and record water temperature, dissolved oxygen, and dissolved nitrogen. You would compile the data in an Excel spreadsheet. As a general rule of thumb, whenever you format a spreadsheet, organize your data in vertical columns, with the first row serving as headers. If you have a larger data set, you might consider additional formatting options that help you organize your data and quickly scan the spreadsheet for the variables you need to create a graph or perform a statistical test. In Microsoft Excel, you can highlight cells and access the “Format Cells” function to select borders to outline your cells. You can also adjust fonts, text sizes, and cell sizes, and color-code cells based on personal preference. There are too many formatting options to list here, so we recommend that you consult the latest Excel help guides and tutorials for any specific formatting you need.
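The column layout described above can also be sketched outside of Excel. The short Python example below is an illustration only: the column names, lake names, dates, and measurement values are invented placeholders, not data from this lesson. It shows one header row followed by one row per observation, and how that layout makes it easy to pull out a single variable by its header.

```python
import csv
import io

# Header row: one name per vertical column, with units in the name.
# These column names are hypothetical, chosen for illustration.
HEADERS = [
    "lake",
    "date",
    "water_temp_c",
    "dissolved_oxygen_mg_l",
    "dissolved_nitrogen_mg_l",
]

# Placeholder rows: invented values for illustration, not real measurements.
# Each row is one observation (one lake on one date).
ROWS = [
    ["Lake A", "2016-06-01", 18.5, 8.2, 0.40],
    ["Lake A", "2016-06-15", 20.1, 7.9, 0.45],
    ["Lake B", "2016-06-01", 19.0, 6.5, 0.90],
    ["Lake B", "2016-06-15", 21.3, 6.1, 1.10],
]


def to_csv_text(headers, rows):
    """Return the table as CSV text with the header row first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


def column(headers, rows, name):
    """Return every value in the column whose header is `name`."""
    index = headers.index(name)
    return [row[index] for row in rows]


csv_text = to_csv_text(HEADERS, ROWS)
oxygen = column(HEADERS, ROWS, "dissolved_oxygen_mg_l")

print(csv_text)
print(oxygen)
```

Because each variable has its own column and its own header, selecting the dissolved oxygen values for a graph or a statistical test takes one lookup, which is the same benefit the layout provides when you scan a spreadsheet by eye.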
When Applicable, Use Excel’s Built-in Formulas for More Advanced Statistics
For this class, we will focus exclusively on performing statistical analyses and creating figures in Microsoft Excel due to its prevalence in universities and businesses. Excel will likely handle most, if not all, of your statistical needs, and it is capable of far more advanced procedures than we have time to cover in this course. If you anticipate going into a major or job with extensive data analysis, such as business, accounting, or science, consider taking an extra course or picking up a book that delves into this program. Excel also works well as a starting point for other, more specialized statistics programs. For example, programs such as SigmaStat and Minitab can import Excel spreadsheets and allow the user to run statistical analyses by clicking through a series of menus to select which test to run, display the results, and create figures. R is an open-source programming language that users can download for free and use to write code that performs statistical analyses and creates figures. R can also import Excel spreadsheets saved as .csv files, and it gives users immense freedom to analyze data as long as they understand how to write the code.
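To give a sense of what analyzing a spreadsheet in code looks like, here is a minimal sketch in Python (used here only for illustration; this course itself works in Excel). It assumes a spreadsheet has been saved as a .csv file with a header row. The file contents, column names, and values below are invented placeholders, and the helper function mean_by_group is a hypothetical name, not part of any library.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Stand-in for a spreadsheet saved as .csv. The values are invented
# placeholders for illustration, not real measurements.
CSV_TEXT = """lake,date,dissolved_oxygen_mg_l
Lake A,2016-06-01,8.0
Lake A,2016-06-15,7.0
Lake B,2016-06-01,6.0
Lake B,2016-06-15,5.0
"""


def mean_by_group(csv_text, group_col, value_col):
    """Average the values in `value_col` separately for each entry in `group_col`."""
    groups = defaultdict(list)
    # DictReader uses the first row as headers, so each record is
    # addressed by column name rather than by position.
    for record in csv.DictReader(io.StringIO(csv_text)):
        groups[record[group_col]].append(float(record[value_col]))
    return {name: mean(values) for name, values in groups.items()}


averages = mean_by_group(CSV_TEXT, "lake", "dissolved_oxygen_mg_l")
print(averages)
```

With a real file, the same function could be given the text returned by reading that file. The sketch relies on the header row to find each variable, which is one reason consistent column labels matter before any analysis begins.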