# Change to a bit better style and larger figure.
sns.set_style('whitegrid')
fig, ax = plt.subplots(figsize=(9, 6))

# Plot our violins.
sns.violinplot(x='Manufacturer', y='Combined MPG', data=df_mpg_filtered)

# Rotate the x-axis labels and remove the plot border on the left.
sns.despine(left=True)
_ = plt.xticks(rotation=45, ha='right')

Pretty, huh? Seaborn's default violin plot is really nice. It shows an actual box plot inside of the violin with the median as a white dot. It normalizes the area of all the violins so you can see the underlying distribution pretty well for each category. And the distribution itself is smoothed nicely to give a general sense of what's going on in the data. However, it's not that easy to see what's really going on and to draw insights out. Let's customize it a bit and see what we can find.

Customizing the Violin Plot

Changing the order

Let's first change the order of the violins so they're ranked, from highest to lowest, by the median MPG of the manufacturer. We do this using the order argument, which takes a list of values in the order you'd like to see them.

ordered = ... .sort_values('Combined MPG', ascending=False) ...
sns.violinplot(x='Manufacturer', y='Combined MPG', data=df_mpg_filtered, order=ordered)
_ = plt.xticks(rotation=45, ha='right')

Now it's a bit easier to tell what's going on: Mazda has some pretty high MPG cars and a tight distribution, Mercedes makes mostly low MPG cars, and General Motors is pretty low on average but might have a few exceptions.

The violin plot creates a smooth distribution on top of the data, which gives it a nice shape but might actually be a bit misleading, because the smoothed curve extends past the highest and lowest values in the data. To stop the violin where the data itself stops, we can use cut=0.

The other thing we'll adjust here is the scale. The default is area, which makes each violin the same area. The other two options are count, which scales the violin by the number of data points in the category, and width, which sets the width to be constant for each violin. To get a sense for how many cars each manufacturer actually makes, let's use scale='count'.

The code below draws each violin in plain gray with scale='width' and no inner box plot, then overlays the individual data points and per-manufacturer summary markers.

fig, ax = plt.subplots(figsize=(9, 6))
sns.violinplot(x='Manufacturer', y='Combined MPG', data=df_mpg_filtered, cut=0, scale='width', inner=None, linewidth=1, color='#DDDDDD', saturation=1, order=ordered, bw=0.2)
sns.stripplot(x='Manufacturer', y='Combined MPG', data=df_mpg_filtered, jitter=True, linewidth=1, order=ordered)

medians = df_mpg_filtered. ...
sns.swarmplot(x='Manufacturer', y='Combined MPG', data=medians, order=ordered, color='white', edgecolor='black', linewidth=1, size=8)
sns.swarmplot(x='Manufacturer', y='Combined MPG', data=q25, order=ordered, color='red', edgecolor='black', linewidth=1, size=6)
sns.swarmplot(x='Manufacturer', y='Combined MPG', data=q75, order=ordered, color='red', edgecolor='black', linewidth=1, size=6)
_ = plt.xticks(rotation=45, ha='right')
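The plotting calls above rely on ordered, medians, q25, and q75, whose definitions are not shown in full. Here is a minimal sketch of one way to compute them with pandas. The toy DataFrame stands in for df_mpg_filtered and its values are invented for illustration; the groupby-based approach is an assumption, not necessarily how the original plots were produced.

```python
import pandas as pd

# Toy stand-in for df_mpg_filtered; the MPG values are invented.
df_mpg_filtered = pd.DataFrame({
    'Manufacturer': ['Mazda', 'Mazda', 'Mazda',
                     'Mercedes', 'Mercedes', 'Mercedes',
                     'General Motors', 'General Motors', 'General Motors'],
    'Combined MPG': [28, 30, 32, 18, 20, 25, 16, 22, 34],
})

# Group the MPG column by manufacturer once and reuse it.
grouped = df_mpg_filtered.groupby('Manufacturer')['Combined MPG']

# Manufacturers ranked from highest to lowest median MPG, as a plain list
# that can be passed to the order argument.
ordered = grouped.median().sort_values(ascending=False).index.tolist()

# One row per manufacturer, keeping the 'Manufacturer' and 'Combined MPG'
# column names so the frames can be passed as data= with the same x and y.
medians = grouped.median().reset_index()
q25 = grouped.quantile(0.25).reset_index()
q75 = grouped.quantile(0.75).reset_index()

print(ordered)
print(medians)
```

Each summary frame has exactly one point per manufacturer, so passing it as data= with the same x, y, and order arguments places a single marker on each violin. One caveat: as far as I know, seaborn 0.13 and later rename the violinplot arguments scale and bw to density_norm and bw_method, so the older names used above may raise a deprecation warning depending on your version.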