python - How do I shift categorical scatter markers to left and right above xticks (multiple data sets per category)?
I have a simple pandas dataframe that I want to plot with matplotlib:
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_excel('sat_data.xlsx', index_col='state')

    plt.figure()
    plt.scatter(df['year'], df['reading'], c='blue', s=25)
    plt.scatter(df['year'], df['math'], c='orange', s=25)
    plt.scatter(df['year'], df['writing'], c='red', s=25)
    plt.show()
Here is what the plot looks like: [image]

I'd like to shift the blue data points a bit to the left, and the red ones a bit to the right, so that each year on the x-axis has 3 mini-columns of scatter data above it instead of all 3 datasets overlapping. I tried and failed to use the 'verts' argument properly. Is there a better way to do this?
A quick and dirty way would be to create a small offset dx, subtract it from the x values of the blue points, and add it to the x values of the red points.
    dx = 0.1
    plt.scatter(df['year'] - dx, df['reading'], c='blue', s=25)
    plt.scatter(df['year'], df['math'], c='orange', s=25)
    plt.scatter(df['year'] + dx, df['writing'], c='red', s=25)
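If there are more than three series, the same idea can be generalized by spreading the offsets evenly around zero. The following is only a sketch: dodge_offsets and plot_dodged are hypothetical helper names, not matplotlib functions, and the default width of 0.2 is chosen so that three series reproduce the dx = 0.1 used above.

```python
def dodge_offsets(n_series, width=0.2):
    """Evenly spaced x-offsets centred on zero, spanning `width` in total."""
    if n_series < 1:
        raise ValueError("n_series must be at least 1")
    if n_series == 1:
        return [0.0]
    step = width / (n_series - 1)
    return [-width / 2 + i * step for i in range(n_series)]


def plot_dodged(df, x, columns, colors, width=0.2, ax=None):
    """Scatter each column against df[x], shifted by its own offset.

    Hypothetical helper: assumes df[x] is numeric (e.g. a year).
    """
    # Imported here so that dodge_offsets itself has no dependencies.
    import matplotlib.pyplot as plt

    if ax is None:
        _, ax = plt.subplots()
    offsets = dodge_offsets(len(columns), width)
    for offset, column, color in zip(offsets, columns, colors):
        ax.scatter(df[x] + offset, df[column], c=color, s=25, label=column)
    ax.legend()
    return ax
```

With the dataframe from the question, this could be called as plot_dodged(df, 'year', ['reading', 'math', 'writing'], ['blue', 'orange', 'red']). Keep the total width well below the spacing between x values (1 for consecutive years), otherwise neighbouring categories run into each other.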
One more option is to use the stripplot function from the seaborn library. It is necessary to melt the original dataframe into long form, so that each row contains a year, a test and a score. Then make a stripplot specifying year as x, score as y, and test as hue. The split keyword argument (named dodge in seaborn 0.8 and later) controls whether the categories are plotted as separate stripes for each x. There's also a jitter argument that adds some noise to the x values, so that they take up a small area instead of being on a single vertical line.
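To make the melt step concrete, here is a small sketch on a made-up two-row frame (the numbers are invented for illustration and are not real SAT data). It shows how each wide row turns into one long row per test.

```python
import pandas as pd

# Tiny made-up wide frame: one row per observation, one column per test.
wide_df = pd.DataFrame({
    'year': [2006, 2007],
    'reading': [540, 550],
    'math': [530, 545],
    'writing': [520, 535],
})

# Long form: one row per (year, test) pair, with the value in 'score'.
long_df = pd.melt(wide_df, id_vars=['year'], var_name='test', value_name='score')
print(long_df)
```

The two wide rows become six long rows with the columns year, test and score, which is the shape that the hue argument of stripplot expects.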
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    # make example data
    np.random.seed(2017)
    df = pd.DataFrame(columns=['reading', 'math', 'writing'],
                      data=np.random.normal(540, 30, size=(1000, 3)))
    df['year'] = np.random.choice(np.arange(2006, 2016), size=1000)

    # melt the data into long form
    df1 = pd.melt(df, var_name='test', value_name='score', id_vars=['year'])

    # make the stripplot
    fig, ax = plt.subplots(figsize=(10, 7))
    # dodge=True separates the hue levels; older seaborn versions called this split=True
    sns.stripplot(data=df1, x='year', y='score', hue='test',
                  jitter=True, dodge=True, alpha=0.7,
                  palette=['blue', 'orange', 'red'])
    plt.show()
Output: [image]