(It’s actually a seaborn scatter plot!)
Whenever I need inspiration for effective visualizations, I browse The Economist, Visual Capitalist, or The Washington Post. During one of these forays, I ran across an interesting infographic, similar to the one shown above, that plotted the age of each member of the US Congress against their generational cohort.
My first impression was that this was a horizontal bar chart, but closer inspection revealed that each bar was composed of multiple markers, making it a scatter plot. Each marker represented one member of Congress.
In this Quick Success Data Science project, we’ll recreate this attractive chart using Python, pandas, and seaborn. Along the way, we’ll unlock a cornucopia of marker types you may not know exist.
Because the United States has Age of Candidacy laws, the birthdays of members of Congress are part of the public record. You can find them in multiple places, including the Biographical Directory of the United States Congress and Wikipedia.
For convenience, I’ve already compiled a CSV file of the names of the current members of Congress, along with their birthdays, branch of government, and party, and stored it in this Gist.
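As a minimal sketch of how such a file might be loaded and turned into ages, the snippet below reads a small in-memory stand-in for the CSV. The column names (`Name`, `Birthday`, `Branch`, `Party`), the placeholder rows, and the helper `load_members` are assumptions made for illustration; the actual Gist may use different headers. The age here is approximated as the current year minus the birth year.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV; the real file's headers and rows may differ.
SAMPLE_CSV = """Name,Birthday,Branch,Party
Member A,1950-03-15,Senate,Democratic
Member B,1975-11-02,House,Republican
Member C,1950-07-30,House,Democratic
"""


def load_members(source, current_year):
    """Read member data and add an approximate 'Age' column (illustrative helper)."""
    df = pd.read_csv(source, parse_dates=['Birthday'])
    # Approximate age: ignores whether the birthday has passed this year.
    df['Age'] = current_year - df['Birthday'].dt.year
    return df


members = load_members(io.StringIO(SAMPLE_CSV), current_year=2023)
print(members[['Name', 'Age']])
```

To use a real file instead, pass its path or URL as `source` in place of the `StringIO` object.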
The following code was written in JupyterLab and is described by cell.
Importing Libraries
from collections import defaultdict # For counting members by age.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import patches # For drawing boxes on the plot.
import pandas as pd
import seaborn as sns
Assigning Constants for the Generational Data
We’ll annotate the plot so that generational cohorts, such as Baby Boomers and Gen X, are highlighted. The following code calculates the current age spans for each cohort and includes lists for generation names and highlight colors. Because we want to treat these lists as constants, we’ll capitalize the names and use an underscore as a prefix.
# Prepare generational data for plotting as boxes on chart:
CURRENT_YEAR = 2023…
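As a rough illustration of the age-span calculation, the sketch below converts birth-year ranges into current age ranges. The birth years are the ranges commonly attributed to the Pew Research Center and are an assumption here, not necessarily the values used in this project; the names `GEN_BIRTH_YEARS`, `GEN_AGE_SPANS`, and `age_span` are likewise hypothetical.

```python
CURRENT_YEAR = 2023

# Assumed birth-year ranges (commonly cited Pew Research Center definitions).
GEN_BIRTH_YEARS = {
    'Silent': (1928, 1945),
    'Baby Boomers': (1946, 1964),
    'Gen X': (1965, 1980),
    'Millennials': (1981, 1996),
    'Gen Z': (1997, 2012),
}


def age_span(birth_start, birth_end, current_year):
    """Return (youngest, oldest) ages for a cohort born in the given range."""
    return current_year - birth_end, current_year - birth_start


# Map each generation name to its current age span.
GEN_AGE_SPANS = {
    name: age_span(start, end, CURRENT_YEAR)
    for name, (start, end) in GEN_BIRTH_YEARS.items()
}

for name, (youngest, oldest) in GEN_AGE_SPANS.items():
    print(f'{name}: ages {youngest} to {oldest}')
```

Each span could then supply the left edge and width of a highlight box along the age axis.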