denisrobert/ExData_Plotting1

Exploratory Data Analysis, Course Project 1

The goal of this project is to produce 4 exploratory graphs illustrating aspects of a data set on electric power consumption extracted from the UC Irvine Machine Learning Repository.

The complete dataset is available as "Electric power consumption" [20 MB].

For the purposes of this project, only two days' worth of data is used, covering 2007-02-01 and 2007-02-02. Each script described below reads in the entire dataset, then filters out all other dates before generating the graphs.

In the source dataset, the variables are as shown below:

  1. Date: Date in format dd/mm/yyyy
  2. Time: time in format hh:mm:ss
  3. Global_active_power: household global minute-averaged active power (in kilowatt)
  4. Global_reactive_power: household global minute-averaged reactive power (in kilowatt)
  5. Voltage: minute-averaged voltage (in volt)
  6. Global_intensity: household global minute-averaged current intensity (in ampere)
  7. Sub_metering_1: energy sub-metering No. 1 (in watt-hour of active energy). It corresponds to the kitchen, containing mainly a dishwasher, an oven and a microwave (hot plates are not electric but gas powered).
  8. Sub_metering_2: energy sub-metering No. 2 (in watt-hour of active energy). It corresponds to the laundry room, containing a washing-machine, a tumble-drier, a refrigerator and a light.
  9. Sub_metering_3: energy sub-metering No. 3 (in watt-hour of active energy). It corresponds to an electric water-heater and an air-conditioner.

The only modifications made to the dataset prior to generating the graphs are:

  • Converting "?" characters to NA so that R correctly processes them as missing values.
  • Converting numeric values in character form to proper numeric values.
  • Combining Date and Time columns into an additional, single POSIXct DateTime column in order to make filtering and graphing simpler.
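
The three preparation steps above, followed by the date filter, can be sketched roughly as follows. This is only an illustration, not the actual content of read_data.R: it assumes the raw file is semicolon-separated with a header row, and it uses a few made-up inline rows (with a reduced set of columns) instead of the real file. The names `raw`, `power` and `keep` are hypothetical.

```r
# Made-up sample rows in the assumed raw layout (";"-separated, "?" = missing).
raw <- "Date;Time;Global_active_power;Voltage
31/1/2007;23:59:00;0.326;243.150
1/2/2007;00:00:00;?;242.100
2/2/2007;23:59:00;3.680;240.290"

# na.strings = "?" turns "?" into NA, which also lets read.table
# infer numeric columns instead of leaving them as character.
power <- read.table(text = raw, header = TRUE, sep = ";",
                    na.strings = "?", stringsAsFactors = FALSE)

# Combine Date (dd/mm/yyyy) and Time (hh:mm:ss) into one POSIXct column.
power$DateTime <- as.POSIXct(paste(power$Date, power$Time),
                             format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

# Keep only the two days of interest.
keep <- as.Date(power$DateTime) %in% as.Date(c("2007-02-01", "2007-02-02"))
power <- power[keep, ]
```

With the real dataset, the `text = raw` argument would be replaced by the path to the unzipped data file.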

Scripts included in the project

There are 6 scripts included in the project:

read_data.R: A shared script, sourced by the other scripts in order to read the dataset and prepare it for graphing.
plot[1-4].R: Four scripts to generate successively more complex graphs, matching the requirements of the exercise.
generate_plots.R: A simple utility to unzip the dataset, generate all plots, then remove the unzipped datafile.
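
As a rough idea of what a utility like generate_plots.R might do, here is a minimal sketch. It is not the script from this repository: the function name `generate_plots` and its arguments are hypothetical, and it assumes the zip file and the plot scripts sit in the current working directory.

```r
# Hypothetical helper: unzip the dataset, run every plot script,
# then remove the extracted file(s) again.
generate_plots <- function(zip_file = "household_power_consumption.zip",
                           scripts = sprintf("plot%d.R", 1:4)) {
  stopifnot(file.exists(zip_file))

  # unzip() returns the paths of the files it extracted.
  extracted <- unzip(zip_file)

  # Clean up the extracted data even if one of the scripts fails.
  on.exit(file.remove(extracted), add = TRUE)

  for (script in scripts) {
    source(script)
  }
  invisible(scripts)
}

# Usage, from the folder containing the scripts and the zip file:
# generate_plots()
```

Using `on.exit` for the cleanup is a design choice in this sketch, so the large unzipped file does not linger when a plot script stops with an error.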

Running the scripts

Before running the scripts, set your working directory to the folder containing the scripts and dataset, whether you source the scripts directly in an R session or run them with Rscript at the command line.

Next, unzip the household_power_consumption.zip file into the current directory.

Then, run each of the plot[1-4].R scripts in turn. Each will generate a file named plot[1-4].png, sized 480 x 480 pixels, in the current directory.
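
For illustration, writing a 480 x 480 pixel PNG from R generally looks like the sketch below. It is not taken from the plot scripts themselves: the histogram, its labels, the synthetic `active_power` values and the temporary output path are all assumptions made to keep the example self-contained.

```r
# Synthetic stand-in for a numeric column from the prepared dataset.
set.seed(1)
active_power <- runif(100, min = 0, max = 6)

# Written to a temporary folder here; the project scripts write to the
# current directory instead.
out_file <- file.path(tempdir(), "plot1.png")

# Open a PNG device at the required size, draw, then close the device
# so the file is flushed to disk.
png(out_file, width = 480, height = 480, units = "px")
hist(active_power, col = "red",
     main = "Global Active Power",
     xlab = "Global Active Power (kilowatts)")
dev.off()
```

From a shell, a single script can be run with `Rscript plot1.R` once the working directory is set as described above.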
