
[๋ฐ์ดํ„ฐ ์‹œ๊ฐํ™”] ํŒŒ์ด์ฌ์œผ๋กœ ๊ทธ๋ž˜ํ”„ ๊ทธ๋ฆฌ๊ธฐ : matplotlib(3) ์‹ฌํ™”

ny:D 2024. 6. 3. 01:30

Drawing a Stacked Bar Chart

💽 Dataset: seaborn Tips

import matplotlib.pyplot as plt
import seaborn as sns
tips = sns.load_dataset('tips')
tips2 = tips.groupby(['time','sex'])['tip'].mean().unstack(1)

📊 Drawing the stacked bar chart


# Setting stacked=True draws a stacked bar chart.
stacked_plot = tips2.plot(kind='bar', stacked=True, color = ['#D4F0F0','#FEE1E8'])
plt.title("Average tips by time")
plt.xlabel("time")
plt.xticks(rotation = 0)
plt.ylabel("average tip")
plt.legend(loc='upper left')
  • The `kind = 'bar'` option selects a bar chart.
  • Setting `stacked = True` stacks the bars, so each column of the DataFrame is drawn on top of the previous one.
  • Set the colors you want with color = [color1, color2], one entry per column.
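To check what `stacked=True` actually does, here is a minimal sketch with a small made-up table shaped like `tips2` (the numbers and the name `demo` are assumptions for illustration, not values from the Tips dataset). It prints the height of each lower bar next to the starting point of the bar stacked on it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up average tips, shaped like tips2: rows = time, columns = sex
demo = pd.DataFrame(
    {"Male": [2.9, 3.1], "Female": [2.6, 3.0]},
    index=pd.Index(["Lunch", "Dinner"], name="time"),
)

ax = demo.plot(kind="bar", stacked=True, color=["#D4F0F0", "#FEE1E8"])

# ax.containers holds one BarContainer per column, in column order
male_bars, female_bars = ax.containers
for lower, upper in zip(male_bars, female_bars):
    # each Female bar starts where the Male bar beneath it ends
    print(lower.get_height(), upper.get_y())
```

The two printed numbers match on every row, which is the stacking: the second column's bars are drawn from the top of the first column's bars instead of from zero.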

Drawing a Dual-Axis Graph

💽 Dataset: seaborn Iris


iris = sns.load_dataset('iris')

๐Ÿ“Š ์ด์ค‘์ถ• ๊ทธ๋ž˜ํ”„ ๊ทธ๋ฆฌ๊ธฐ

1. ๊ธฐ๋ณธ ์Šคํƒ€์ผ ์„ค์ •

# plot ์Šคํƒ€์ผ ์ง€์ •
plt.style.use('ggplot')

# ๊ทธ๋ž˜ํ”„์— ํ‘œ์‹œ๋  ํฐํŠธ ์‚ฌ์ด์ฆˆ ์ง€์ •
plt.rcParams['font.size'] = 10

2. ๋ฐ์ดํ„ฐ ์ค€๋น„


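# Group once so the species labels and both means share the same order
means = iris.groupby('species')[['sepal_width', 'sepal_length']].mean()
x = means.index.to_numpy()
y1 = means['sepal_width'].values   # drawn as the line on the first axis
y2 = means['sepal_length'].values  # drawn as the bars on the second axis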

3. Draw the basic line plot

fig, ax1 = plt.subplots()

# Draw the basic line plot
ax1.plot(x, y1, '-o', color='green', markersize=7, linewidth=5, alpha=0.7, label='sepal_width')

ax1.set_ylim(0, 10) # y-axis range
ax1.set_xlabel('Species') # x-axis label
ax1.set_ylabel('sepal_width') # y-axis label

# Set the tick properties
ax1.tick_params(axis='both', direction='in')
  • Drawing the basic line plot
    • '-o' : draws a solid line with a circle marker at each x value
  • Setting tick properties with .tick_params()
    • axis = 'both' : applies the setting to the ticks of both the x and y axes
    • direction = 'in' : draws the tick marks pointing into the plot area

4. Share the x axis

# Share the x axis (this is what makes it a dual-axis graph)
ax2 = ax1.twinx()

# Draw the second graph
ax2.bar(x, y2, color='pink', label='sepal_length', alpha=0.7, width=0.7)
ax2.set_ylim(0, 10) # y-axis range
ax2.set_ylabel('sepal_length')

# Set the tick properties
ax2.tick_params(axis='y', direction='in')

5. Show the legend and labels

# Show the legend
lines, labels = ax1.get_legend_handles_labels() 
bars, labels2 = ax2.get_legend_handles_labels()
ax1.legend(lines + bars, labels + labels2, loc='upper left')

# Set the graph title
plt.title('Sepal Width and Length by Species', size=12)
plt.show()
  • .get_legend_handles_labels() : returns the legend handles and labels of an Axes.
  • .legend(handles1 + handles2, labels1 + labels2, loc = 'location') : combines the entries of both axes into a single legend
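The legend merging can be tried on its own without loading any dataset. The sketch below uses made-up means (the numbers are illustrative only) and collects the legend text so you can see that both axes end up in one legend.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

species = ["setosa", "versicolor", "virginica"]
widths = [3.4, 2.8, 3.0]    # illustrative values only
lengths = [5.0, 5.9, 6.6]   # illustrative values only

fig, ax1 = plt.subplots()
ax1.plot(species, widths, "-o", color="green", label="sepal_width")

# twinx() creates a second Axes that shares x but has its own y axis
ax2 = ax1.twinx()
ax2.bar(species, lengths, color="pink", alpha=0.7, label="sepal_length")

# Each Axes only knows its own artists, so collect from both and concatenate
lines, labels = ax1.get_legend_handles_labels()
bars, labels2 = ax2.get_legend_handles_labels()
legend = ax1.legend(lines + bars, labels + labels2, loc="upper left")

legend_texts = [t.get_text() for t in legend.get_texts()]
print(legend_texts)
```

Calling `ax1.legend()` with no arguments would list only the line, because the bars belong to `ax2`. That is why the handles and labels are joined by hand.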

ํ”ผ๋ผ๋ฏธ๋“œ ๊ทธ๋ž˜ํ”„ ๊ทธ๋ฆฌ๊ธฐ 

๐Ÿ’ฝ ํ™œ์šฉ ๋ฐ์ดํ„ฐ์…‹ - seaborn  Titanic


import pandas as pd

titanic = sns.load_dataset("titanic")

# Convert age into 5-year intervals; each bin gets a "left - right" label
bins = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80]
bin_labels = [f"{left} - {right}" for left, right in zip(bins[:-1], bins[1:])]

# Split into fixed intervals with cut; missing ages and ages of 5 or below become NaN.
# The labels are ordered categories, so the groups stay in age order rather than string order.
titanic['age'] = pd.cut(titanic['age'], bins=bins, labels=bin_labels)

# Count passengers per age group and sex, then move sex into the columns
titanic = titanic.groupby(['age','sex'], observed=False).size().unstack('sex', fill_value=0).reset_index()
titanic['age'] = titanic['age'].astype(str)  # plain strings for the y axis
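To see how `pd.cut` treats the edges, here is a small sketch with a made-up list of ages (the values and the names `ages` and `edges` are assumptions for illustration). Intervals are closed on the right by default, so an age of exactly 10 would fall in `(5, 10]`, and anything outside the edges or missing becomes NaN.

```python
import pandas as pd

ages = pd.Series([4, 7, 12, 18, 18, 33, None])  # made-up ages, one missing
edges = [5, 10, 15, 20, 25, 30, 35]

binned = pd.cut(ages, bins=edges)

# Each non-missing value is an Interval with .left and .right attributes
second = binned.iloc[1]
print(second, second.left, second.right)

# Count per interval in interval order; empty intervals are kept with a count of 0
counts = binned.value_counts(sort=False)
print(counts)

# Age 4 (below the first edge) and the missing age are both NaN after cut
print(binned.isna().sum())
```

Since NaN has no `.left` or `.right`, any code that reads those attributes needs to skip or drop the unbinned rows first.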

๐Ÿ“Š ํ”ผ๋ผ๋ฏธ๋“œ ๊ทธ๋ž˜ํ”„ ๊ทธ๋ฆฌ๊ธฐ

1. ํ”ผ๋ผ๋ฏธ๋“œ ๊ทธ๋ž˜ํ”„๋ฅผ ๊ทธ๋ฆฌ๊ธฐ ์œ„ํ•ด ํ•„์š”ํ•œ ๊ฐ’์„ ๊ณ„์‚ฐํ•œ๋‹ค.

titanic["Female_Left"] = 0
titanic["Female_Width"] = titanic["female"]
titanic["Male_Left"] = -titanic["male"]
titanic["Male_Width"] = titanic["male"]
titanic

2. Draw the bars.

dplot6 = plt.figure(figsize=(7,5))

# Draw a horizontal bar on each side
plt.barh(y=titanic["age"], width=titanic["Female_Width"], color="#ED553B", label="Female")
plt.barh(y=titanic["age"], width=titanic["Male_Width"], left=titanic["Male_Left"],color="#3CAEA3", label="Male")

# Limit the plot range
plt.xlim(-100,100)
plt.ylim(-2,15)

# Use text in place of a legend on each side
plt.text(-60, 13.7, "Male", fontsize=10, fontweight="bold")
plt.text(50, 13.7, "Female", fontsize=10, fontweight="bold")

3. Label each data value.

for idx in range(len(titanic)):
    plt.text(x=titanic["Male_Left"][idx]-0.5, y=idx, s="{}".format(titanic['male'][idx]),
             ha="right", va="center",
             fontsize=8, color="#3CAEA3")
    plt.text(x=titanic["Female_Width"][idx]+0.5, y=idx, s="{}".format(titanic["female"][idx]),
             ha="left", va="center",
             fontsize=8, color="#ED553B")

4. Set the graph title.

plt.title("Pyramid plot", loc="center", pad=15, fontsize=15, fontweight="bold")
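The core of the pyramid is the `left` argument: the male bars start at minus their own width, so they end exactly at 0 where the female bars begin. A minimal sketch with made-up counts (the numbers are illustrative, not Titanic values) shows this. The tick relabeling at the end is an optional extra that is not in the steps above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

groups = ["20 - 25", "25 - 30", "30 - 35"]
male = [40, 35, 20]      # made-up counts
female = [30, 25, 15]    # made-up counts

fig, ax = plt.subplots(figsize=(7, 5))
# Female bars grow to the right from 0 (the default left edge)
female_bars = ax.barh(y=groups, width=female, color="#ED553B", label="Female")
# Male bars start at -count and are count wide, so they end at 0
male_bars = ax.barh(y=groups, width=male, left=[-m for m in male],
                    color="#3CAEA3", label="Male")

# Optional: show absolute values so the left side does not read as negative
ticks = [-40, -20, 0, 20, 40]
ax.set_xticks(ticks)
ax.set_xticklabels([str(abs(t)) for t in ticks])

first = male_bars[0]
print(first.get_x(), first.get_x() + first.get_width())
```

The printed pair is the left and right edge of the first male bar, which confirms that the bar ends at the center line.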

์—ฌ๋Ÿฌ๊ฐœ์˜ ๊ทธ๋ž˜ํ”„ ํ•œ๋ฒˆ์— ๊ทธ๋ฆฌ๊ธฐ

๐Ÿ’ฝ ํ™œ์šฉ ๋ฐ์ดํ„ฐ์…‹ - seaborn  Penguins


penguins = sns.load_dataset('penguins')
penguins.head()

📊 Putting several graphs in one figure

# Create the figure and axes
fig, ax = plt.subplots(2, 2)

# Draw the graphs
ax[0, 0].hist(penguins['body_mass_g'])
ax[0, 1].boxplot(penguins.groupby('sex')['bill_depth_mm'].apply(list))
ax[1, 0].plot(penguins.groupby('species')['bill_length_mm'].mean())
ax[1, 1].scatter(penguins['bill_length_mm'],penguins['bill_depth_mm'])

# ๊ทธ๋ž˜ํ”„ ์‚ฌ์ด ๊ฐ„๊ฒฉ ์ถ”๊ฐ€
fig.subplots_adjust(hspace=0.5, wspace=0.3)

# ๊ทธ๋ž˜ํ”„๋ณ„ ํƒ€์ดํ‹€ ์ถ”๊ฐ€
ax[0, 0].set_title("hist plot", fontsize=9) 
ax[0, 1].set_title("box plot", fontsize=9)  
ax[1, 0].set_title("line plot", fontsize=9)  
ax[1, 1].set_title("scatter plot", fontsize=9)  

plt.show()
  • subplots(2,2) → declares a 2 x 2 grid of 4 cells (this creates the frame)
  • Assign a plot to each cell by its position index
  • Use subplots_adjust to add spacing between the graphs
  • Add a title to each graph : ax[row, col].set_title('title', fontsize=font size)
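For reference, `plt.subplots(2, 2)` returns the axes as a 2 x 2 NumPy array, which is why the `ax[row, col]` indexing works. The sketch below is one possible shortcut, not a required pattern: it sets the same four titles by looping over `ax.flat`, which walks the grid row by row.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2, 2)
print(ax.shape)  # the grid of Axes is a NumPy array

# ax.flat visits ax[0, 0], ax[0, 1], ax[1, 0], ax[1, 1] in that order
titles = ["hist plot", "box plot", "line plot", "scatter plot"]
for axis, title in zip(ax.flat, titles):
    axis.set_title(title, fontsize=9)

fig.subplots_adjust(hspace=0.5, wspace=0.3)
```

The loop is handy when every cell gets the same kind of setting. For different plot types per cell, indexing each position as in the code above is still the clearer choice.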