<Notice>
I'm still learning myself, so this post may contain mistakes or inaccuracies.
Corrections and feedback are welcome.
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
In [4]:
egg_df = pd.read_csv('./GallusGallusDomesticus.csv')
In [6]:
egg_df
Out[6]:
| | GallusID | GallusBreed | Day | Age | GallusWeight | GallusEggColor | GallusEggWeight | AmountOfFeed | EggsPerDay | GallusCombType | SunLightExposure | GallusClass | GallusLegShanksColor | GallusBeakColor | GallusEarLobesColor | GallusPlumage |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Marans1 | Marans | 1 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
1 | Marans1 | Marans | 2 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
2 | Marans1 | Marans | 3 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
3 | Marans1 | Marans | 4 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
4 | Marans1 | Marans | 5 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
995 | Ameraucana11 | Ameraucana | 1 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
996 | Ameraucana11 | Ameraucana | 2 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
997 | Ameraucana11 | Ameraucana | 3 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
998 | Ameraucana11 | Ameraucana | 4 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
999 | Ameraucana11 | Ameraucana | 5 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
1000 rows × 16 columns
In [8]:
egg_df['GallusID'].unique()
Out[8]:
array(['Marans1', 'Marans2', 'Marans3', 'Marans4', 'Marans5', 'Marans6',
'Marans7', 'Marans8', 'Marans9', 'Marans10', 'Marans11',
'Marans12', 'Marans13', 'Marans14', 'Marans15', 'Marans16',
'Marans17', 'Marans18', 'Marans19', 'Marans20', 'Marans21',
'Marans22', 'Marans23', 'Marans24', 'Marans25', 'Marans26',
'Marans27', 'Marans28', 'Marans29', 'Marans30', 'Marans31',
'Marans32', 'Marans33', 'Marans34', 'Marans35', 'Marans36',
'Marans37', 'Marans38', 'Marans39', 'Marans40', 'Marans41',
'Marans42', 'Marans43', 'Marans44', 'Marans45', 'Marans46',
'Marans47', 'Marans48', 'Marans49', 'Marans50', 'Marans51',
'Marans52', 'Marans53', 'Marans54', 'Marans55', 'Marans56',
'Marans57', 'Marans58', 'Marans59', 'Marans60', 'Marans61',
'Marans62', 'Marans63', 'Marans64', 'Marans65', 'Marans66',
'Marans67', 'Marans68', 'Marans69', 'Marans70', 'Marans71',
'Marans72', 'Marans73', 'Marans74', 'Marans75', 'Marans76',
'Marans77', 'Marans78', 'Marans79', 'Marans80', 'Marans81',
'Marans82', 'Marans83', 'Marans84', 'Marans85', 'Marans86',
'Marans87', 'Marans88', 'Marans89', 'Marans90', 'Marans91',
'Marans92', 'Marans93', 'Marans94', 'Marans95', 'Marans96',
'Marans97', 'Marans98', 'Marans99', 'Marans100', 'Marans101',
'Marans102', 'Marans103', 'Marans104', 'Marans105', 'Marans106',
'Marans107', 'Marans108', 'Marans109', 'Marans110', 'Marans111',
'Marans112', 'Marans113', 'Marans114', 'Marans115', 'Marans116',
'Marans117', 'Marans118', 'Marans119', 'Marans120', 'Marans121',
'Marans122', 'Marans123', 'Marans124', 'Marans125', 'Marans126',
'Marans127', 'Marans128', 'Marans129', 'Marans130', 'Marans131',
'Marans132', 'Marans133', 'Marans134', 'Marans135', 'Marans136',
'Marans137', 'Marans138', 'Marans139', 'Marans140', 'Marans141',
'Marans142', 'Marans143', 'Marans144', 'Marans145', 'Marans146',
'Marans147', 'Marans148', 'Marans149', 'Marans150', 'Marans151',
'Marans152', 'Marans153', 'Marans154', 'Marans155', 'Marans156',
'Marans157', 'Marans158', 'Marans159', 'Marans160', 'Marans161',
'Marans162', 'Marans163', 'Marans164', 'Marans165', 'Marans166',
'Marans167', 'Marans168', 'Marans169', 'Marans170', 'Marans171',
'Marans172', 'Marans173', 'Marans174', 'Marans175', 'Marans176',
'Marans177', 'Marans178', 'Marans179', 'Ameraucana1',
'Ameraucana2', 'Ameraucana3', 'Ameraucana4', 'Ameraucana5',
'Ameraucana6', 'Ameraucana7', 'Ameraucana8', 'Ameraucana9',
'Ameraucana10', 'Ameraucana11'], dtype=object)
In [25]:
egg_df['Age'].unique()
Out[25]:
array([883, 684, 132, 226, 288, 125, 494, 984, 908, 619, 802, 725, 491,
342, 134, 833, 474, 906, 442, 233, 118, 899, 866, 444, 425, 859,
179, 469, 265, 567, 832, 460, 902, 124, 626, 825, 182, 635, 236,
904, 648, 936, 979, 863, 809, 193, 896, 769, 913, 84, 660, 699,
228, 102, 350, 407, 48, 990, 461, 664, 555, 929, 361, 761, 694,
933, 400, 76, 332, 681, 344, 617, 91, 568, 187, 840, 224, 201,
612, 594, 801, 447, 358, 692, 939, 732, 591, 589, 72, 393, 94,
294, 466, 526, 615, 55, 395, 405, 717, 24, 559, 961, 95, 206,
418, 186, 383, 200, 146, 262, 35, 147, 639, 266, 160, 707, 475,
661, 753, 853, 71, 521, 61, 117, 907, 964, 421, 845, 814, 702,
882, 92, 976, 529, 800, 250, 659, 867, 687, 488, 547, 576, 799,
161, 854, 135, 893, 700, 143, 107, 554, 467, 705, 987, 556, 858,
316, 795, 844, 514, 727, 208, 439, 204, 177, 901, 384], dtype=int64)
In [44]:
egg_df.describe()
Out[44]:
| | Day | Age | GallusWeight | GallusEggWeight | AmountOfFeed | EggsPerDay | SunLightExposure |
---|---|---|---|---|---|---|---|
count | 1000.00000 | 1000.000000 | 1000.000000 | 1000.000000 | 1000.000000 | 1000.000000 | 1000.000000 |
mean | 3.25000 | 522.010000 | 2217.850000 | 43.427100 | 116.250000 | 0.965000 | 8.300000 |
std | 1.78625 | 284.765045 | 438.544409 | 7.510839 | 7.514917 | 0.183872 | 1.269493 |
min | 1.00000 | 24.000000 | 1500.000000 | 30.080000 | 100.000000 | 0.000000 | 5.000000 |
25% | 2.00000 | 246.500000 | 1840.000000 | 36.895000 | 110.000000 | 1.000000 | 7.000000 |
50% | 3.00000 | 527.500000 | 2170.000000 | 43.775000 | 116.000000 | 1.000000 | 8.000000 |
75% | 4.00000 | 796.000000 | 2640.000000 | 50.020000 | 123.000000 | 1.000000 | 9.000000 |
max | 10.00000 | 990.000000 | 3000.000000 | 58.930000 | 129.000000 | 1.000000 | 11.000000 |
In [45]:
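## Numeric columns only: newer pandas raises an error if corr() meets string columns
egg_df.select_dtypes(include='number').corr()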
Out[45]:
| | Day | Age | GallusWeight | GallusEggWeight | AmountOfFeed | EggsPerDay | SunLightExposure |
---|---|---|---|---|---|---|---|
Day | 1.000000 | 0.001914 | -0.082693 | 0.135436 | -0.094145 | 0.026668 | -0.077250 |
Age | 0.001914 | 1.000000 | 0.052396 | 0.077701 | 0.128792 | -0.304632 | 0.076941 |
GallusWeight | -0.082693 | 0.052396 | 1.000000 | -0.101810 | 0.128234 | 0.095273 | 0.161182 |
GallusEggWeight | 0.135436 | 0.077701 | -0.101810 | 1.000000 | -0.068021 | -0.037039 | -0.025482 |
AmountOfFeed | -0.094145 | 0.128792 | 0.128234 | -0.068021 | 1.000000 | -0.109570 | 0.007345 |
EggsPerDay | 0.026668 | -0.304632 | 0.095273 | -0.037039 | -0.109570 | 1.000000 | 0.045028 |
SunLightExposure | -0.077250 | 0.076941 | 0.161182 | -0.025482 | 0.007345 | 0.045028 | 1.000000 |
In [35]:
marans_df = egg_df[egg_df['GallusBreed']=='Marans']
marans_df
Out[35]:
| | GallusID | GallusBreed | Day | Age | GallusWeight | GallusEggColor | GallusEggWeight | AmountOfFeed | EggsPerDay | GallusCombType | SunLightExposure | GallusClass | GallusLegShanksColor | GallusBeakColor | GallusEarLobesColor | GallusPlumage |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Marans1 | Marans | 1 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
1 | Marans1 | Marans | 2 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
2 | Marans1 | Marans | 3 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
3 | Marans1 | Marans | 4 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
4 | Marans1 | Marans | 5 | 883 | 3000 | Brown | 41.19 | 114 | 1 | Single | 7 | Continental | White | White | NaN | Blue Copper |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
890 | Marans179 | Marans | 1 | 844 | 2710 | Brown | 34.86 | 117 | 1 | Single | 8 | Continental | White | White | NaN | Splash |
891 | Marans179 | Marans | 2 | 844 | 2710 | Brown | 34.86 | 117 | 1 | Single | 8 | Continental | White | White | NaN | Splash |
892 | Marans179 | Marans | 3 | 844 | 2710 | Brown | 34.86 | 117 | 1 | Single | 8 | Continental | White | White | NaN | Splash |
893 | Marans179 | Marans | 4 | 844 | 2710 | Brown | 34.86 | 117 | 1 | Single | 8 | Continental | White | White | NaN | Splash |
894 | Marans179 | Marans | 5 | 844 | 2710 | Brown | 34.86 | 117 | 1 | Single | 8 | Continental | White | White | NaN | Splash |
895 rows × 16 columns
In [37]:
marans_df.describe()
Out[37]:
| | Day | Age | GallusWeight | GallusEggWeight | AmountOfFeed | EggsPerDay | SunLightExposure |
---|---|---|---|---|---|---|---|
count | 895.000000 | 895.000000 | 895.000000 | 895.000000 | 895.000000 | 895.000000 | 895.000000 |
mean | 3.000000 | 522.346369 | 2249.106145 | 42.561788 | 116.849162 | 0.960894 | 8.385475 |
std | 1.415004 | 287.309090 | 448.328096 | 7.291649 | 7.522869 | 0.193956 | 1.069200 |
min | 1.000000 | 24.000000 | 1500.000000 | 30.080000 | 105.000000 | 0.000000 | 7.000000 |
25% | 2.000000 | 250.000000 | 1880.000000 | 36.410000 | 110.000000 | 1.000000 | 7.000000 |
50% | 3.000000 | 554.000000 | 2210.000000 | 42.890000 | 117.000000 | 1.000000 | 8.000000 |
75% | 4.000000 | 800.000000 | 2670.000000 | 49.180000 | 123.000000 | 1.000000 | 9.000000 |
max | 5.000000 | 990.000000 | 3000.000000 | 54.980000 | 129.000000 | 1.000000 | 10.000000 |
In [46]:
marans_df.select_dtypes(include='number').corr()
Out[46]:
| | Day | Age | GallusWeight | GallusEggWeight | AmountOfFeed | EggsPerDay | SunLightExposure |
---|---|---|---|---|---|---|---|
Day | 1.000000e+00 | 3.909989e-17 | -3.969028e-17 | 1.116959e-17 | 4.321181e-17 | 1.374448e-17 | 1.502128e-17 |
Age | 3.909989e-17 | 1.000000e+00 | 4.492881e-02 | 8.251887e-02 | 1.647259e-01 | -3.196183e-01 | 4.031080e-02 |
GallusWeight | -3.969028e-17 | 4.492881e-02 | 1.000000e+00 | -3.101354e-02 | 7.529480e-02 | 1.127975e-01 | 1.298787e-01 |
GallusEggWeight | 1.116959e-17 | 8.251887e-02 | -3.101354e-02 | 1.000000e+00 | 4.097998e-02 | -6.437112e-02 | 8.987063e-02 |
AmountOfFeed | 4.321181e-17 | 1.647259e-01 | 7.529480e-02 | 4.097998e-02 | 1.000000e+00 | -9.987379e-02 | -1.779497e-02 |
EggsPerDay | 1.374448e-17 | -3.196183e-01 | 1.127975e-01 | -6.437112e-02 | -9.987379e-02 | 1.000000e+00 | 7.277204e-02 |
SunLightExposure | 1.502128e-17 | 4.031080e-02 | 1.298787e-01 | 8.987063e-02 | -1.779497e-02 | 7.277204e-02 | 1.000000e+00 |
In [47]:
ameraucana_df = egg_df[egg_df['GallusBreed']=='Ameraucana']
ameraucana_df
Out[47]:
| | GallusID | GallusBreed | Day | Age | GallusWeight | GallusEggColor | GallusEggWeight | AmountOfFeed | EggsPerDay | GallusCombType | SunLightExposure | GallusClass | GallusLegShanksColor | GallusBeakColor | GallusEarLobesColor | GallusPlumage |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
895 | Ameraucana1 | Ameraucana | 1 | 514 | 1740 | Light Blue | 47.12 | 114 | 1 | Pea | 6 | NaN | NaN | White | White | Black |
896 | Ameraucana1 | Ameraucana | 2 | 514 | 1740 | Light Blue | 47.12 | 114 | 1 | Pea | 6 | NaN | NaN | White | White | Black |
897 | Ameraucana1 | Ameraucana | 3 | 514 | 1740 | Light Blue | 47.12 | 114 | 1 | Pea | 6 | NaN | NaN | White | White | Black |
898 | Ameraucana1 | Ameraucana | 4 | 514 | 1740 | Light Blue | 47.12 | 114 | 1 | Pea | 6 | NaN | NaN | White | White | Black |
899 | Ameraucana1 | Ameraucana | 5 | 514 | 1740 | Light Blue | 47.12 | 114 | 1 | Pea | 6 | NaN | NaN | White | White | Black |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
995 | Ameraucana11 | Ameraucana | 1 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
996 | Ameraucana11 | Ameraucana | 2 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
997 | Ameraucana11 | Ameraucana | 3 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
998 | Ameraucana11 | Ameraucana | 4 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
999 | Ameraucana11 | Ameraucana | 5 | 384 | 1800 | Bluish-Green | 53.10 | 110 | 1 | Pea | 7 | NaN | Slate Blue | Brown | White | Blue Wheaten |
105 rows × 16 columns
In [40]:
ameraucana_df.describe()
Out[40]:
| | Day | Age | GallusWeight | GallusEggWeight | AmountOfFeed | EggsPerDay | SunLightExposure |
---|---|---|---|---|---|---|---|
count | 105.000000 | 105.000000 | 105.000000 | 105.000000 | 105.000000 | 105.0 | 105.000000 |
mean | 5.380952 | 519.142857 | 1951.428571 | 50.802857 | 111.142857 | 1.0 | 7.571429 |
std | 2.883577 | 263.345577 | 198.777168 | 4.845328 | 5.154215 | 0.0 | 2.248320 |
min | 1.000000 | 177.000000 | 1720.000000 | 42.940000 | 100.000000 | 1.0 | 5.000000 |
25% | 3.000000 | 208.000000 | 1780.000000 | 47.120000 | 109.000000 | 1.0 | 6.000000 |
50% | 5.000000 | 469.000000 | 1880.000000 | 51.360000 | 112.000000 | 1.0 | 7.000000 |
75% | 8.000000 | 727.000000 | 2100.000000 | 52.240000 | 116.000000 | 1.0 | 11.000000 |
max | 10.000000 | 936.000000 | 2370.000000 | 58.930000 | 117.000000 | 1.0 | 11.000000 |
In [42]:
ameraucana_df.select_dtypes(include='number').corr()
Out[42]:
| | Day | Age | GallusWeight | GallusEggWeight | AmountOfFeed | EggsPerDay | SunLightExposure |
---|---|---|---|---|---|---|---|
Day | 1.000000 | 0.021390 | 0.031753 | -0.019761 | 0.009242 | NaN | 0.010594 |
Age | 0.021390 | 1.000000 | 0.233892 | 0.105153 | -0.316493 | NaN | 0.267494 |
GallusWeight | 0.031753 | 0.233892 | 1.000000 | -0.138104 | 0.333438 | NaN | 0.241276 |
GallusEggWeight | -0.019761 | 0.105153 | -0.138104 | 1.000000 | -0.548975 | NaN | -0.210662 |
AmountOfFeed | 0.009242 | -0.316493 | 0.333438 | -0.548975 | 1.000000 | NaN | -0.197954 |
EggsPerDay | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
SunLightExposure | 0.010594 | 0.267494 | 0.241276 | -0.210662 | -0.197954 | NaN | 1.000000 |
In [60]:
plt.scatter(x=ameraucana_df['GallusWeight'],y = ameraucana_df['GallusEggWeight'],c = 'Green')
plt.scatter(x=marans_df['GallusWeight'],y = marans_df['GallusEggWeight'],c = 'Red')
plt.show()
Relationship between hen weight and egg weight by breed¶
- Ameraucana hens weigh less than Marans hens, but their eggs are heavier
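The breed comparison above can also be checked with numbers instead of by eye. This is a minimal sketch, assuming a small hand-built frame (`demo_df`, a hypothetical stand-in for `egg_df`) that copies four hens from the tables above. Each hen repeats once per `Day` with the same weights in the rows shown, so the sketch keeps one row per `GallusID` before averaging, so that hens with more recorded days do not count extra.

```python
import pandas as pd

# Hypothetical stand-in for egg_df: four hens copied from the tables above,
# each repeated for two days, the way the real data repeats a hen per Day.
hens = [
    ("Marans1", "Marans", 3000, 41.19),
    ("Marans179", "Marans", 2710, 34.86),
    ("Ameraucana1", "Ameraucana", 1740, 47.12),
    ("Ameraucana11", "Ameraucana", 1800, 53.10),
]
rows = [(gid, breed, day, w, ew) for gid, breed, w, ew in hens for day in (1, 2)]
demo_df = pd.DataFrame(
    rows,
    columns=["GallusID", "GallusBreed", "Day", "GallusWeight", "GallusEggWeight"],
)

# Keep a single row per hen, then average hen weight and egg weight per breed.
per_hen = demo_df.drop_duplicates(subset="GallusID")
breed_means = per_hen.groupby("GallusBreed")[["GallusWeight", "GallusEggWeight"]].mean()
print(breed_means)
```

On the real data the same two lines would run on `egg_df` in place of `demo_df`.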
In [91]:
egg_df[[
'GallusEggColor','GallusClass','GallusLegShanksColor','GallusBeakColor','GallusEarLobesColor','GallusPlumage']]
Out[91]:
| | GallusEggColor | GallusClass | GallusLegShanksColor | GallusBeakColor | GallusEarLobesColor | GallusPlumage |
---|---|---|---|---|---|---|
0 | Brown | Continental | White | White | NaN | Blue Copper |
1 | Brown | Continental | White | White | NaN | Blue Copper |
2 | Brown | Continental | White | White | NaN | Blue Copper |
3 | Brown | Continental | White | White | NaN | Blue Copper |
4 | Brown | Continental | White | White | NaN | Blue Copper |
... | ... | ... | ... | ... | ... | ... |
995 | Bluish-Green | NaN | Slate Blue | Brown | White | Blue Wheaten |
996 | Bluish-Green | NaN | Slate Blue | Brown | White | Blue Wheaten |
997 | Bluish-Green | NaN | Slate Blue | Brown | White | Blue Wheaten |
998 | Bluish-Green | NaN | Slate Blue | Brown | White | Blue Wheaten |
999 | Bluish-Green | NaN | Slate Blue | Brown | White | Blue Wheaten |
1000 rows × 6 columns
In [68]:
egg_df.groupby(['GallusBreed','GallusLegShanksColor'])['GallusBreed'].count()
Out[68]:
GallusBreed GallusLegShanksColor
Ameraucana Slate Black 30
Slate Blue 45
Marans White 895
Name: GallusBreed, dtype: int64
In [70]:
## Class: "class" as in the taxonomic ranks (species, genus, family, order, class, phylum, kingdom)
egg_df.groupby(['GallusBreed','GallusClass'])['GallusBreed'].count()
Out[70]:
GallusBreed GallusClass
Marans Continental 895
Name: GallusBreed, dtype: int64
In [71]:
egg_df.groupby(['GallusBreed','GallusEarLobesColor'])['GallusBreed'].count()
Out[71]:
GallusBreed GallusEarLobesColor
Ameraucana Red 50
White 35
Name: GallusBreed, dtype: int64
In [72]:
egg_df.groupby(['GallusBreed','GallusPlumage'])['GallusBreed'].count()
Out[72]:
GallusBreed GallusPlumage
Ameraucana Black 20
Blue Wheaten 25
Brown Red 20
Buff 20
Silver 10
White 10
Marans Blue 235
Blue Copper 210
Splash 230
Splash Copper 220
Name: GallusBreed, dtype: int64
Comparing differences between breeds¶
In [92]:
## Breed vs. EggColor
sns.catplot(x= 'GallusEggColor',col = 'GallusBreed' ,kind = 'count', data = egg_df)
plt.show()
In [93]:
## Breed vs. shank color (shank as in the lower leg, not the name Shanks)
sns.catplot(x= 'GallusLegShanksColor',col = 'GallusBreed' ,kind = 'count', data = egg_df)
plt.show()
In [94]:
## Breed vs. beak color
sns.catplot(x= 'GallusBeakColor',col = 'GallusBreed' ,kind = 'count', data = egg_df)
plt.show()
In [76]:
## Breed vs. earlobe color
sns.catplot(x= 'GallusEarLobesColor',col = 'GallusBreed' ,kind = 'count', data = egg_df)
plt.show()
In [83]:
## Breed vs. plumage (feather color type)
sns.catplot(x= 'GallusPlumage',col = 'GallusBreed' ,kind = 'count', data = egg_df)
plt.show()
Comparing associations between traits within a breed¶
In [99]:
## Shank color vs. EggColor
## Marans excluded
sns.catplot(x= 'GallusEggColor',col = 'GallusLegShanksColor' ,kind = 'count', data = ameraucana_df)
plt.show()
In [98]:
## Beak color vs. EggColor
## Marans excluded
sns.catplot(x= 'GallusEggColor',col = 'GallusBeakColor' ,kind = 'count', data = ameraucana_df)
plt.show()
In [81]:
## Ameraucana: association between shank color and plumage
## Marans excluded because its only shank color is white
sns.catplot(x= 'GallusPlumage',col = 'GallusLegShanksColor' ,kind = 'count', data = ameraucana_df)
plt.show()
In [84]:
## Ameraucana: association between earlobe color and shank color
## Marans excluded because its only shank color is white
sns.catplot(x= 'GallusEarLobesColor',col = 'GallusLegShanksColor' ,kind = 'count', data = ameraucana_df)
plt.show()
In [88]:
## Ameraucana: association between earlobe color and plumage
## Marans has no earlobe data
sns.catplot(x= 'GallusPlumage',col = 'GallusEarLobesColor' ,kind = 'count', data = egg_df)
plt.show()
In [89]:
egg_df[egg_df['GallusPlumage'] == 'Silver']
Out[89]:
| | GallusID | GallusBreed | Day | Age | GallusWeight | GallusEggColor | GallusEggWeight | AmountOfFeed | EggsPerDay | GallusCombType | SunLightExposure | GallusClass | GallusLegShanksColor | GallusBeakColor | GallusEarLobesColor | GallusPlumage |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
935 | Ameraucana5 | Ameraucana | 1 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
936 | Ameraucana5 | Ameraucana | 2 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
937 | Ameraucana5 | Ameraucana | 3 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
938 | Ameraucana5 | Ameraucana | 4 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
939 | Ameraucana5 | Ameraucana | 5 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
940 | Ameraucana5 | Ameraucana | 6 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
941 | Ameraucana5 | Ameraucana | 7 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
942 | Ameraucana5 | Ameraucana | 8 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
943 | Ameraucana5 | Ameraucana | 9 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
944 | Ameraucana5 | Ameraucana | 10 | 936 | 1880 | Light Blue | 51.36 | 100 | 1 | Pea | 11 | NaN | NaN | NaN | NaN | Silver |
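Each breed has body parts with characteristic traits¶
- Every Marans had 'White' shanks and an NA earlobe value (presumably left blank because there was nothing to record?). Plumage was split among 'Blue', 'Blue Copper', 'Splash', and 'Splash Copper' in similar proportions ('Blue' was the most common and 'Blue Copper' the least)
- Ameraucana had 'Slate Black' or 'Slate Blue' shanks and 'White' or 'Red' earlobes
Plumage differed slightly by shank color, and 'Red' earlobes were generally more common than 'White'.
- Oddly, one Ameraucana hen (Ameraucana5) has 'Silver' plumage while its class, shank, beak, and earlobe values are all NaN.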
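The NaN patterns noted above can be counted per breed rather than read off the plots. This is a minimal sketch, assuming a tiny hand-built frame (`demo_df`, hypothetical) that only mimics the pattern described: Marans without earlobe values and one 'Silver' Ameraucana with its other trait columns empty.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for egg_df with the same column names.
demo_df = pd.DataFrame({
    "GallusBreed": ["Marans", "Marans", "Ameraucana", "Ameraucana"],
    "GallusLegShanksColor": ["White", "White", "Slate Blue", np.nan],
    "GallusEarLobesColor": [np.nan, np.nan, "White", np.nan],
    "GallusPlumage": ["Blue Copper", "Splash", "Blue Wheaten", "Silver"],
})
trait_cols = ["GallusLegShanksColor", "GallusEarLobesColor", "GallusPlumage"]

# isna() marks missing cells as True; summing the booleans per breed
# gives the number of missing values in each trait column.
missing_by_breed = demo_df[trait_cols].isna().groupby(demo_df["GallusBreed"]).sum()
print(missing_by_breed)
```

A column that is missing for every row of a breed shows a count equal to that breed's row count, which makes "no earlobe data for Marans" easy to confirm.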
Correlation between hen weight and amount of feed¶
In [138]:
## Hen weight vs. amount of feed
plt.xlabel('GallusWeight')
plt.ylabel('AmountOfFeed')
plt.scatter(x='GallusWeight',y = 'AmountOfFeed',color = 'green',data = marans_df)
plt.scatter(x='GallusWeight',y = 'AmountOfFeed',color = 'red',data = ameraucana_df)
plt.show()
In [102]:
## Hen weight vs. amount of feed
plt.scatter(x='GallusWeight',y = 'AmountOfFeed',color = 'green',data = marans_df)
plt.scatter(x='GallusWeight',y = 'AmountOfFeed',color = 'red',data = ameraucana_df)
plt.show()
Hen age vs. hen weight, egg weight, and amount of feed¶
In [140]:
## Hen age vs. hen weight
plt.scatter(x='Age',y = 'GallusWeight',color = 'green',data = marans_df)
plt.scatter(x='Age',y = 'GallusWeight',color = 'red',data = ameraucana_df)
plt.show()
In [141]:
## Hen age vs. egg weight
plt.scatter(x='Age',y = 'GallusEggWeight',color = 'green',data = marans_df)
plt.scatter(x='Age',y = 'GallusEggWeight',color = 'red',data = ameraucana_df)
plt.show()
In [143]:
## Hen age vs. amount of feed
plt.scatter(x='Age',y = 'AmountOfFeed',color = 'green',data = marans_df)
plt.scatter(x='Age',y = 'AmountOfFeed',color = 'red',data = ameraucana_df)
plt.show()
In [120]:
## Question
sns.catplot (x = 'GallusWeight', y = 'AmountOfFeed',hue = 'GallusBreed', kind = 'point', data = egg_df)
plt.xticks(np.arange(1500,3000,100))
plt.show()
# Why doesn't xticks work properly when I do this?
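As far as I can tell, the reason is that `kind = 'point'` treats x as a categorical axis: each distinct `GallusWeight` is drawn at position 0, 1, 2, and so on, so ticks requested at 1500 to 2900 land far away from the data. Below is a minimal sketch of one workaround, assuming a plain numeric scatter plot is an acceptable replacement for the point plot. `demo_df` is a hypothetical stand-in for `egg_df`, and the `Agg` backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for egg_df (values are illustrative only).
demo_df = pd.DataFrame({
    "GallusWeight": [1500, 1800, 2200, 2700, 3000],
    "AmountOfFeed": [105, 110, 116, 121, 129],
})

fig, ax = plt.subplots()
# scatter keeps GallusWeight on a numeric axis, so tick values are real weights.
ax.scatter(demo_df["GallusWeight"], demo_df["AmountOfFeed"])
# The stop value 3001 makes arange include the 3000 tick.
ax.set_xticks(np.arange(1500, 3001, 100))
ax.set_xlabel("GallusWeight")
ax.set_ylabel("AmountOfFeed")
tick_positions = ax.get_xticks()
```

With a numeric axis the ticks sit every 100 units of weight, which is what the `np.arange` call was meant to do.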
Abengers (from an upstanding blogger who takes copyright seriously)
In [14]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
Core Mission¶
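- Answer the following questions.
- Each character has several stats such as intelligence, strength, and so on. Which character has the largest sum of these stats? Show the steps you took to find out.
- We want to know how the stat values are distributed for good characters and for bad characters. Choose a suitable plot, do the preprocessing it requires, and visualize it.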
Extra Mission¶
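- Answer the following questions.
- Each character has several stats such as intelligence, strength, and so on. Each character also comes from a comics publisher such as DC or Marvel. Which publisher's characters have the highest average stat total? Show the steps you took to find out.
- Good characters and bad characters are about to clash. The team with the higher average stat total wins. However, neutral characters, who cannot stand injustice, join whichever team was at a disadvantage before any neutral characters were counted. Which side wins in the end? Show the steps you took to find out. If there is any other EDA or visualization you want to try with this data, feel free to do so. This task should be done after the Core Mission.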
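The clash rule in the last item can be sketched in a few lines. This is only an illustration under stated assumptions: the `Total` values are made up, `demo_df` is a hypothetical stand-in for the character table, and the label `'neutral'` is an assumed value for `Alignment`, since only `good` and `bad` appear in the output shown in this post.

```python
import pandas as pd

# Hypothetical data: the Totals and the 'neutral' label are assumptions.
demo_df = pd.DataFrame({
    "Alignment": ["good", "good", "bad", "bad", "neutral"],
    "Total": [300, 400, 450, 350, 100],
})

# Step 1: average Total of each side before any neutral character joins.
sides = demo_df[demo_df["Alignment"].isin(["good", "bad"])]
means = sides.groupby("Alignment")["Total"].mean()
underdog = means.idxmin()  # the side with the lower average

# Step 2: neutral characters join the underdog, then the averages are recomputed.
joined = demo_df.assign(Team=demo_df["Alignment"].replace({"neutral": underdog}))
final_means = joined.groupby("Team")["Total"].mean()
winner = final_means.idxmax()
print(underdog, winner)
```

Because the rule compares averages, neutral characters with low totals can pull their new team's average down, as they do in this made-up example.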
Approach¶
- Sorting the DataFrame by Total in descending order shows which character has the largest stat sum
In [10]:
abengers_df = pd.read_csv("./charcters_stats.csv")
abengers_df
Out[10]:
| | Name | Alignment | Intelligence | Strength | Speed | Durability | Power | Combat | Total |
---|---|---|---|---|---|---|---|---|---|
0 | 3-D Man | good | 50 | 31 | 43 | 32 | 25 | 52 | 233 |
1 | A-Bomb | good | 38 | 100 | 17 | 80 | 17 | 64 | 316 |
2 | Abe Sapien | good | 88 | 14 | 35 | 42 | 35 | 85 | 299 |
3 | Abin Sur | good | 50 | 90 | 53 | 64 | 84 | 65 | 406 |
4 | Abomination | bad | 63 | 80 | 53 | 90 | 55 | 95 | 436 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
606 | Yellowjacket | good | 88 | 10 | 12 | 28 | 12 | 14 | 164 |
607 | Yellowjacket II | good | 50 | 10 | 35 | 28 | 31 | 28 | 182 |
608 | Ymir | good | 50 | 100 | 27 | 100 | 83 | 28 | 388 |
609 | Zatanna | good | 75 | 10 | 23 | 28 | 100 | 56 | 292 |
610 | Zoom | bad | 50 | 10 | 100 | 28 | 72 | 28 | 288 |
611 rows × 9 columns
In [13]:
abengers_df.sort_values(by= ['Total'],ascending = False)
Out[13]:
| | Name | Alignment | Intelligence | Strength | Speed | Durability | Power | Combat | Total |
---|---|---|---|---|---|---|---|---|---|
361 | Martian Manhunter | good | 100 | 100 | 96 | 100 | 100 | 85 | 581 |
242 | General Zod | bad | 94 | 100 | 96 | 100 | 94 | 95 | 579 |
535 | Superboy-Prime | bad | 94 | 100 | 100 | 100 | 100 | 85 | 579 |
537 | Superman | good | 100 | 100 | 100 | 100 | 94 | 85 | 579 |
16 | Amazo | bad | 75 | 100 | 100 | 100 | 100 | 100 | 575 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
462 | Renata Soliz | good | 1 | 1 | 1 | 1 | 0 | 1 | 5 |
137 | Captain Mar-vell | good | 1 | 1 | 1 | 1 | 0 | 1 | 5 |
136 | Captain Epic | good | 1 | 1 | 1 | 1 | 0 | 1 | 5 |
466 | Ripcord | good | 1 | 1 | 1 | 1 | 0 | 1 | 5 |
335 | Lady Deathstrike | bad | 1 | 1 | 1 | 1 | 0 | 1 | 5 |
611 rows × 9 columns
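The sort above answers the first Core Mission question. As a cross-check, here is a minimal sketch that recomputes the sum from the six stat columns and picks the top row directly. `demo_df` is a hypothetical stand-in holding four rows copied from the output above, and `nlargest` is my substitution for the full sort, not something the original cell used.

```python
import pandas as pd

# Four rows copied from the sorted output above.
demo_df = pd.DataFrame({
    "Name": ["Martian Manhunter", "General Zod", "Superman", "Zoom"],
    "Intelligence": [100, 94, 100, 50],
    "Strength": [100, 100, 100, 10],
    "Speed": [96, 96, 100, 100],
    "Durability": [100, 100, 100, 28],
    "Power": [100, 94, 94, 72],
    "Combat": [85, 95, 85, 28],
    "Total": [581, 579, 579, 288],
})
stat_cols = ["Intelligence", "Strength", "Speed", "Durability", "Power", "Combat"]

# Check that Total really is the row-wise sum of the six stats.
recomputed = demo_df[stat_cols].sum(axis=1)
totals_match = bool((recomputed == demo_df["Total"]).all())

# keep="all" returns every character tied for first place, if there is a tie.
top = demo_df.nlargest(1, "Total", keep="all")
print(totals_match, top["Name"].tolist())
```

`keep="all"` matters for this data because ties do occur lower down, for example the three characters at 579.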
In [16]:
good_df = abengers_df[abengers_df['Alignment'] == 'good']
good_df
Out[16]:
| | Name | Alignment | Intelligence | Strength | Speed | Durability | Power | Combat | Total |
---|---|---|---|---|---|---|---|---|---|
0 | 3-D Man | good | 50 | 31 | 43 | 32 | 25 | 52 | 233 |
1 | A-Bomb | good | 38 | 100 | 17 | 80 | 17 | 64 | 316 |
2 | Abe Sapien | good | 88 | 14 | 35 | 42 | 35 | 85 | 299 |
3 | Abin Sur | good | 50 | 90 | 53 | 64 | 84 | 65 | 406 |
6 | Adam Monroe | good | 63 | 10 | 12 | 100 | 71 | 64 | 320 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
604 | X-Man | good | 88 | 53 | 53 | 95 | 92 | 84 | 465 |
606 | Yellowjacket | good | 88 | 10 | 12 | 28 | 12 | 14 | 164 |
607 | Yellowjacket II | good | 50 | 10 | 35 | 28 | 31 | 28 | 182 |
608 | Ymir | good | 50 | 100 | 27 | 100 | 83 | 28 | 388 |
609 | Zatanna | good | 75 | 10 | 23 | 28 | 100 | 56 | 292 |
432 rows × 9 columns
In [18]:
bad_df = abengers_df[abengers_df['Alignment'] == 'bad']
bad_df
Out[18]:
| | Name | Alignment | Intelligence | Strength | Speed | Durability | Power | Combat | Total |
---|---|---|---|---|---|---|---|---|---|
4 | Abomination | bad | 63 | 80 | 53 | 90 | 55 | 95 | 436 |
5 | Abraxas | bad | 88 | 100 | 83 | 99 | 100 | 56 | 526 |
11 | Air-Walker | bad | 50 | 85 | 100 | 85 | 100 | 40 | 460 |
16 | Amazo | bad | 75 | 100 | 100 | 100 | 100 | 100 | 575 |
17 | Ammo | bad | 1 | 1 | 1 | 1 | 0 | 1 | 5 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
586 | Warp | bad | 38 | 10 | 23 | 28 | 63 | 50 | 212 |
590 | Weapon XI | bad | 1 | 1 | 1 | 1 | 0 | 1 | 5 |
593 | Willis Stryker | bad | 38 | 16 | 23 | 28 | 41 | 60 | 206 |
605 | Yellow Claw | bad | 1 | 1 | 1 | 1 | 0 | 1 | 5 |
610 | Zoom | bad | 50 | 10 | 100 | 28 | 72 | 28 | 288 |
165 rows × 9 columns
In [24]:
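## Fails: the positional DataFrame is taken as `x`; catplot wants column names in x=/y= and the frame in data=
sns.catplot(bad_df)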
c:\python39\lib\site-packages\seaborn\_decorators.py:36: FutureWarning: Pass the following variable as a keyword arg: x. From version 0.12, the only valid positional argument will be `data`, and passing other arguments without an explicit keyword will result in an error or misinterpretation.
warnings.warn(
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_12736/2703098325.py in <module>
----> 1 sns.catplot(bad_df)
c:\python39\lib\site-packages\seaborn\_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
c:\python39\lib\site-packages\seaborn\categorical.py in catplot(x, y, hue, data, row, col, col_wrap, estimator, ci, n_boot, units, seed, order, hue_order, row_order, col_order, kind, height, aspect, orient, color, palette, legend, legend_out, sharex, sharey, margin_titles, facet_kws, **kwargs)
3790 p = _CategoricalPlotter()
3791 p.require_numeric = plotter_class.require_numeric
-> 3792 p.establish_variables(x_, y_, hue, data, orient, order, hue_order)
3793 if (
3794 order is not None
c:\python39\lib\site-packages\seaborn\categorical.py in establish_variables(self, x, y, hue, data, orient, order, hue_order, units)
154
155 # Figure out the plotting orientation
--> 156 orient = infer_orient(
157 x, y, orient, require_numeric=self.require_numeric
158 )
c:\python39\lib\site-packages\seaborn\_core.py in infer_orient(x, y, orient, require_numeric)
1309 """
1310
-> 1311 x_type = None if x is None else variable_type(x)
1312 y_type = None if y is None else variable_type(y)
1313
c:\python39\lib\site-packages\seaborn\_core.py in variable_type(vector, boolean_type)
1227
1228 # Special-case all-na data, which is always "numeric"
-> 1229 if pd.isna(vector).all():
1230 return "numeric"
1231
c:\python39\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1535 @final
1536 def __nonzero__(self):
-> 1537 raise ValueError(
1538 f"The truth value of a {type(self).__name__} is ambiguous. "
1539 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
To do tomorrow: finish the Avengers df analysis and watch the week 3 lectures.
For the Avengers df, try gathering the stat names onto the x-axis and the stat values onto the y-axis.
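The plan of putting stat names on the x-axis and stat values on the y-axis amounts to reshaping the table from wide to long form. This is a minimal sketch of that preprocessing step using `DataFrame.melt`, assuming a hypothetical three-row stand-in (`demo_df`) copied from the tables above. The column names `Stat` and `Value` are my own choice.

```python
import pandas as pd

# Three rows copied from the tables above (Total left out on purpose).
demo_df = pd.DataFrame({
    "Name": ["A-Bomb", "Abomination", "Zoom"],
    "Alignment": ["good", "bad", "bad"],
    "Intelligence": [38, 63, 50],
    "Strength": [100, 80, 10],
    "Speed": [17, 53, 100],
    "Durability": [80, 90, 28],
    "Power": [17, 55, 72],
    "Combat": [64, 95, 28],
})
stat_cols = ["Intelligence", "Strength", "Speed", "Durability", "Power", "Combat"]

# Wide to long: one row per (character, stat) pair.
long_df = demo_df.melt(
    id_vars=["Name", "Alignment"],
    value_vars=stat_cols,
    var_name="Stat",    # hypothetical name for the stat-label column
    value_name="Value", # hypothetical name for the stat-value column
)
print(long_df.head())
```

With the long frame, a call such as `sns.catplot(x='Stat', y='Value', hue='Alignment', kind='box', data=long_df)` should draw good and bad distributions side by side for each stat, which is what the second Core Mission question asks for.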