Social Journalism Spring 2015
Room 438 | Wednesdays, 2:30 - 5:30 |
---|---|
Amanda Hickman | amanda.hickman@journalism.cuny.edu |
Tumblr: | http://dataskills.tumblr.com |
Class Wiki: | https://github.com/amandabee/CUNY-data-skills/wiki |
Assignments: | Google Drive |
Issues: | https://github.com/amandabee/CUNY-data-skills/issues |
It isn’t hyperbole: journalists today have access to more data than ever before, as well as to better tools to understand that data and retell the stories it holds. Whether you want to understand your own audience better, measure the impact of efforts to expand your reach, or just tell stories about the impact of policy, a little bit of data can go a long way, if you have the skills to put data to use.
This semester we will work together to gather, analyze and visualize numbers you need to understand your audience and to tell interactive data-driven stories.
We’ll look at the data that helps us listen to audiences – who are they and what are they asking? We’ll gather data that can help answer those questions, interrogate the data to make sure we have clear and reliable answers, and present that data in clear, engaging stories and reports.
We’ll use Excel (or LibreOffice’s Calc) and some command-line tools like CSVkit to dig into numbers and we’ll use web-based tools such as CartoDB and Highcharts to create maps and charts that clearly illustrate your findings. You’ll pick up a little HTML, CSS and jQuery along the way – just enough to show off your work online. This is not a course in coding, but programmers of all skill levels are welcome.
Note: this class will go deep on some data analysis tools, such as spreadsheets, that you will be expected to master. We will also introduce more advanced tools that you won’t master in a semester.
Lecture: what you can expect from me | Homework: what I expect from you |
---|---|
Jan 28: Finding and defining data, Context | Readings |
Feb 04: Visual Encoding, CSVs, Pivot Tables | Pre-pitches – data you’re interested in |
Feb 11: Cleaning data with OpenRefine, FTP | Spreadsheet exercise, Pitch (community profile) |
Feb 18: Mapping | Data cleaning exercise |
Feb 25: Finding patterns with maps | Map exercise, Storyboard (community profile) |
Mar 04: Charts, Visual Encoding | Map exercise 2, Pitch (data driven story) |
Mar 11: Presentation, Navigation, Bootstrap | Chart exercise |
Mar 18: Completeness, Advanced Chart Layout | Chart exercise 2, Storyboard (data driven story) |
Mar 25: Building forms, Writing surveys | Rough Draft (community profile) |
Apr 01: Community Study critique | Survey exercise, Final (community profile) |
Apr 08: Spring Break | |
Apr 15: Command line tools, CSVkit | Revised survey exercise, install CSVkit |
Apr 22: Show your work, publishing numbers | Rough Draft (data driven story) |
Apr 29: Hands on TBD | CSVkit Exercise, Final (data driven story) |
May 06: Data Driven Crit, Hands on TBD | Revisions Community Study |
May 13: Wrap Up | Revisions Data Driven |
Project | Pitch | Storyboard | Draft | Final | Revision |
---|---|---|---|---|---|
Community Profile | Feb 11 | Feb 25 | Mar 25 | Apr 01 | May 06 |
Data Driven Story | Mar 04 | Mar 18 | Apr 22 | Apr 29 | May 13 |
Course outcomes
This semester you’ll learn to:
The skills we build in this course will be as applicable to reporting as they are to the work of interpreting signals from your audience.
Amanda Hickman works at the intersection of journalism and civic engagement, and especially values reporting that makes it easier for individuals to participate in democratic processes. As program director at DocumentCloud, she helped reporters around the world analyze, annotate, and publish primary source documents. Amanda managed development of a series of games about public policy issues as Gotham Gazette‘s director of technology. She has spent more than a decade reporting on local and international events and working directly with community based organizations to understand, and draw their membership into, the political process. Amanda has taught at Columbia Graduate School of Journalism, NYU’s Gallatin School, the CUNY Graduate Center and CUNY Graduate School of Journalism.
We’ll be using a handful of free and open source software tools this semester and a few that are just free of charge:
You should already have TextWrangler and Excel installed. You’ll need both. If you don’t have Excel, you can also use LibreOffice Calc for all our spreadsheet work.
You will need to create accounts on JS Fiddle, CartoDB and Stack Exchange GIS. Important! Before you create your CartoDB account, make sure you have the information you need to get the student discount. Their standard free option is not adequate to our needs and upgrading is much more difficult than using the discount in the first place.
Students will work in pairs to complete two major assignments. In addition, regular homework exercises will reinforce skills we’ve learned in class.
Assignments will be timed to allow you to dig deeper into your work in Information and Communities and your reporting course. You will complete two major assignments: one a community profile, the other a reported data-driven story. Students will work in pairs on the two major assignments and will develop a compelling pitch, a clear storyboard, a comprehensive rough draft and a complete final piece for each.
We have ready access to rich and varied data about communities of all shapes, whether those communities are joined by geography, interests, heritage, age or a combination of all of those. We’ll use that data to add quantitative understanding to what you already know about a community.
Students will work in pairs to compile a portfolio of data about a single community, based primarily on publicly available data, though where appropriate students are welcome to incorporate proprietary intelligence. Identify at least eight salient characteristics (is income relevant? educational attainment? is this community anchored in a single location or geographically dispersed?) and points of comparison (neighboring communities, for instance, in the case of a geographically-based community), and find data to quantify those characteristics.
Your final product will be a short report that describes the community in numbers and puts those numbers in context.
Just how many hospitals have closed in New York State in the last two years? Where were they and who was impacted? Do low-income New Yorkers have better access to fresh produce than they did ten years ago? Can they walk to green markets? Where are the most dangerous intersections in New York City? How many of them are near schools? These are all examples of questions we can answer with data that is already accessible to the public.
For the data driven story, students will work in pairs to identify newsworthy data, pitch and report a story of no more than 600 words that includes at least two visualizations of that data. The story should have news value and the reporters should demonstrate a clear understanding of the data and its limitations. Students should speak with as many experts as necessary to write responsibly about these numbers.
Participation and attendance are required and are important to your success in the class. This is a fast-moving, skills-based class. We’ll be tackling new tools every week and it will be very, very difficult to get caught up if you miss class.
Please be on time for class and back to class from breaks.
Your grade is determined by three factors: active participation in class, homework assignments, and the two major team assignments.
Course Grade Overall | |
---|---|
Participation | 20% |
Homework assignments | 20% |
Community Profile | 30% |
Data Driven Story | 30% |
Grades for your two team assignments are further broken down as follows:
Assignment Grades | |
---|---|
Pitch | 25% |
Storyboard | 15% |
Draft | 25% |
Final | 25% |
Revision | 10% |
This means that if you complete a brilliant story or profile but don’t put real effort into your pitch or rough draft, you can’t get better than a C on the story.
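The weights in the two tables above can be combined with a little arithmetic. The sketch below is only an illustration, not an official grade calculator: it assumes each component is scored from 0 to 100, and the function name `weighted_score` and all sample scores are hypothetical. It makes no assumption about letter-grade cutoffs.

```python
# Weights (in percent) copied from the two grading tables above.
COURSE_WEIGHTS = {
    "participation": 20,
    "homework": 20,
    "community_profile": 30,
    "data_driven_story": 30,
}
ASSIGNMENT_WEIGHTS = {
    "pitch": 25,
    "storyboard": 15,
    "draft": 25,
    "final": 25,
    "revision": 10,
}


def weighted_score(scores, weights):
    """Weighted average of 0-100 scores; weights are percents that sum to 100."""
    return sum(weights[part] * scores[part] for part in weights) / 100


# Hypothetical case: perfect work on everything except a pitch that earned no credit.
story_scores = {"pitch": 0, "storyboard": 100, "draft": 100, "final": 100, "revision": 100}
story_total = weighted_score(story_scores, ASSIGNMENT_WEIGHTS)

# Hypothetical case: that story combined with perfect scores everywhere else.
course_scores = {
    "participation": 100,
    "homework": 100,
    "community_profile": 100,
    "data_driven_story": story_total,
}
course_total = weighted_score(course_scores, COURSE_WEIGHTS)

print(story_total)
print(course_total)
```

Under these assumptions, a pitch that earns no credit caps the assignment at 75 out of 100, no matter how strong the final piece is.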
Pitches: A complete pitch should tell me who cares, why we care now, and what pre-reporting you’ve done. You must include:
Storyboards: A storyboard organizes your content conceptually and spatially. This semester, when you turn in storyboards, you should also include a revised pitch and a proposed nut graf. Your nut graf will change as your story develops, but it should capture all of the main elements of your story.
We use wireframe and storyboards interchangeably here. We’re looking for a simple sketch (on paper, in Word, or PowerPoint, Illustrator, or any number of online storyboarding tools) that shows us how you intend to integrate your visualizations, words, and navigation elements. Use simple boxes to tell us where your different elements will be positioned in a design, and how a user will navigate through the content. Check out Mark Luckie’s thoughts on sketching/storyboarding, with examples, from 10,000 Words.
Rough Drafts: A rough draft does not have to have the polish of a final project, but it should be close. You should have created the visualizations that you plan to use. Your classmates should be able to evaluate a rough draft on its merits, without a guided tour of forthcoming features. A complete rough draft includes:
Final Story: Your story must be available online (most of you will be using gist/bl.ocks), and the URL must be added to the assignments document.
It is a serious ethical violation to take any material created by another person and represent it as your own original work. Any such plagiarism will result in serious disciplinary action, possibly including dismissal from the CUNY J-School. Plagiarism may involve copying text from a book or magazine without attributing the source, or lifting words, code, photographs, videos, or other materials from the Internet and attempting to pass them off as your own. Please ask the instructor if you have any questions about how to distinguish between acceptable research and plagiarism.
In addition to being a serious academic issue, copyright is a serious legal issue.
Never “lift” or “borrow” or “appropriate” or “repurpose” graphics, audio, or code without both permission and attribution. This applies to scripts, audio, video clips, programs, photos, drawings, and other images, and it includes images found online and in books.
Create your own graphics, seek out images that are in the public domain or shared via a Creative Commons license that allows derivative works, or use images from the AP Photo Bank or images for which the school has obtained licensing.
If you’re repurposing code, be sure to keep the original licensing intact. If you’re not sure how to credit code, ask.
The exception to this rule is fair use: if your story is about the image itself, it is often acceptable to reproduce the image. If you want to better understand fair use, the Citizen Media Law Project is an excellent resource.
When in doubt: ask.
Festival of Data: Every week one student will choose a data driven story to present in class. Prepare to discuss the strengths and weaknesses of the story, the authors’ use of data as well as their use of interactivity, and to identify the underlying technology. Blog your story in the “Festival of Data” category by 5 PM on your week.
Weekly Reading:
Due Week 01: Read: Searches for “sundown” on Yom Kippur, Homicide Watch interview, Homicide Watch on Search Queries
Discussion: Welcome and Expectations
What is data, what are data stories?
How does data provide context? What can’t data answer?
Discussion: How does data help you understand a community?
Discussion: Finding data
Festival of Data: “In Climbing Income Ladder, Location Matters”
Due Week 02: Pre-pitches: Identify three datasets that interest you. Write a short blog post that describes the provenance of the data (who maintains it?), where the data can be found (include a link) and in less than 200 words each, explain why the data is interesting. This can be data that gives context to the community you’re working in, or data that is worth doing some reporting on. Post these pre-pitches to the class blog.
Make sure that Firefox is installed on your computer, with the Web Developer Toolbar. Install Tabula. Create an Academy account with CartoDB – start from http://cartodb.com/academic to get the education discount. You will not be able to do the necessary work in this class without an “Academy” CartoDB plan.
Read Cairo: The Functional Art, Reading part 1: pages 25-31, 36-44, on thinking through a visualization as a tool for the reader; what graphical form best serves the goal? On e-reserve in the Library.
Read The Perils of Polling Twitter
Discuss homework: Problems, challenges, solutions
Discuss: Visual Encoding, Story expectations
Hands-on: Tabula and Pivot Tables
Due Week 03:
Spreadsheet exercise; pitch your community profile; make sure OpenRefine is installed on your computer.
Hands-on: Basic HTML and working with Gists
Hands-on: Cleaning data with OpenRefine
Workshop Community Profile pitches: are these the right data points to consider? What else would you want to know to understand this community?
Due Week 04:
OpenRefine exercise, make sure you have TextWrangler installed. Map reading.
NOTE: CUNY academic calendar has us on a Monday schedule Feb 18. We’re meeting anyway.
Discussion: Asking good questions
Discussion: Limitations of maps, impact of mapping choices
Discussion: Storyboards
Hands-on: CartoDB, PostGIS, adding images to a gist
Due Week 05:
Mapping exercise, Storyboards, Readings: Steele and Iliinsky, Designing Data Visualizations Chapter 4: Choose Appropriate Visual Encodings (in Library); Cairo: The Functional Art, Reading part 2: pages 118-129, on Cleveland & McGill’s perceptual accuracy
Discussion: Review Maps, Maps as Research
Hands-on: Troubleshooting map exercise; If we get there: Advanced GIS queries
Due Week 06: Storyboard (Community); Mapping Exercise #2
Discussion: Pitches, Visual Encoding
Hands-on: Making charts
Workshop Data driven story pitches: is this interesting? Is this the right angle? What would make this story something you’d like to read?
Due Week 07: Chart exercise; read Cairo: The Functional Art, Reading part 3: pages 73-86, on presentation.
Hands-on: Using Bootstrap templates
Hands-on: Putting a Highcharts function in an HTML page
Due Week 08:
Chart exercise 2, Storyboard (data driven)
Read selections from Tufte, The Visual Display of Quantitative Information, on e-reserve in the Library: pages 91-105, 176-190.
Hands-on: Highcharts API
Hands-on: Redesign exercise
Due Week 09: Rough Draft (community)
Discussion: Polls, samples and surveys
Writing questions people will answer, getting data into and out of a data store
Due Week 10: Final Story (Community), Survey exercise
Workshop: Surveys – too many questions? Too few? Would you answer these?
Workshop: Community Study
Due Week 11: Revised survey exercise, dabble in the command line, focus on your data driven story
No class. Spring break.
Hands-on: Regular Expressions and CSVkit to work with big text files
Due Week 12: Rough Draft (data driven story)
Hands-on: Regular Expressions and CSVkit
Due Week 13: CSVkit exercise, Final story (data driven)
Discussion: infographics for print and web
Hands-on: infographic redesign
Due Week 14: Revisions to Community Study, Infographic exercise
Hands on: we’ll take stock of how much we’ve learned and either go deeper on a tool you’d like more of, or tackle a new tool.
Due Week 15: Revisions to Data Driven Reporting
Discussion: closing thoughts
Fill out student evaluations