`1`. The following shows images and their corresponding histograms. Match each image with its histogram. (50 points)

Example: a-c, b-d

Answer: a-c, b-d

`2`. Using the given data, reproduce the visualization shown in the example. (5 points)

Data

x=[1,2,3,4]
y=[1,2,4,3]

Example visualization

import matplotlib.pyplot as plt
x=[1,2,3,4]
y=[1,2,4,3]
fig,((ax1,ax2),(ax3,ax4))=plt.subplots(2,2)
ax1.plot(x,y,'--bo')
ax2.plot(x,y,'or')
ax3.plot(x,y,'xk')
ax4.plot(x,y,'--m.')
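
The format strings used above pack line style, color, and marker into a single short argument. As a rough sketch, assuming matplotlib's format-string shorthand, `'--bo'` can be spelled out with keyword arguments. The axis and variable names below are illustrative and not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 2, 4, 3]

fig, (ax_short, ax_long) = plt.subplots(1, 2)

# Shorthand: dashed line ('--'), blue ('b'), circle markers ('o').
(short_line,) = ax_short.plot(x, y, '--bo')

# The same style written with explicit keyword arguments.
(long_line,) = ax_long.plot(x, y, linestyle='--', color='b', marker='o')
```

A format string without a line style, such as `'or'` or `'xk'`, draws markers only.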

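`3`. Below is Anscombe's plot. Select everyone who interprets it correctly. (5 points)

(Hani) Panels (a)-(d) all have a positive correlation coefficient.

(Na Ae-ri) In panel (b) the scatter plot follows a curve rather than a straight line, so the correlation coefficient is 0.

(Hong Du-kkae) In panel (c), the correlation coefficient is 1 once the outlier at the top is removed.

(Go Eun-ae) By changing the value of the outlier on the right of panel (d) appropriately, the correlation coefficient of (d) can be made negative.

(Lee Chang-su) Likewise, by changing the value of the outlier at the top of panel (c) appropriately, the correlation coefficient of (c) can be made negative.

Answer: Hani, Hong Du-kkae, Go Eun-ae, Lee Chang-su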
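
The claim that moving a single outlier can flip the sign of the correlation can be checked numerically. The sketch below uses made-up data shaped like panel (d), where every x value is identical except for one outlier. It is not Anscombe's actual data, and `corr_with_outlier` is a hypothetical helper.

```python
import numpy as np

# Hypothetical data shaped like panel (d): every x is identical except one outlier.
x = np.array([8, 8, 8, 8, 8, 19], dtype=float)
y_base = np.array([5, 6, 7, 8, 9], dtype=float)

def corr_with_outlier(y_outlier):
    """Correlation of (x, y) when the outlier's y value is y_outlier."""
    y = np.append(y_base, y_outlier)
    return np.corrcoef(x, y)[0, 1]

r_high = corr_with_outlier(12.0)  # outlier placed above the other points
r_low = corr_with_outlier(2.0)    # outlier placed below the other points
```

With the outlier above the other points the coefficient is positive, and with the outlier below them it is negative.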

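`4`. The following scatter plot shows the relationship between ice cream consumption and polio. Colors group observations with similar temperatures. Select all correct interpretations. (10 points)

(Hani) Ice cream and polio are positively correlated.

(Na Ae-ri) The closer the correlation coefficient is to 1, the clearer the causal relationship between ice cream and polio.

(Hong Du-kkae) Among observations with similar temperatures, the correlation coefficient between ice cream and polio is close to 0.

(Go Eun-ae) When temperature is controlled for, the correlation coefficient between ice cream and polio is 0, so it is hard to claim a causal relationship between the two. (Assume that, once temperature is controlled for, ice cream is eaten at random.)

Answer: Hani, Hong Du-kkae, Go Eun-ae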
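
A small simulation can illustrate why a strong overall correlation may vanish once temperature is held fixed. This is only a sketch under an assumed model in which temperature drives both variables and neither affects the other. The group temperatures, noise levels, and variable names are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: temperature drives both variables; neither affects the other.
temps = np.repeat([10.0, 20.0, 30.0], 2000)        # three temperature groups
ice_cream = temps + rng.normal(0, 1, temps.size)   # depends only on temperature
polio = temps + rng.normal(0, 1, temps.size)       # depends only on temperature

# Correlation over all observations pooled together.
overall_r = np.corrcoef(ice_cream, polio)[0, 1]

# Correlation within each temperature group (temperature held fixed).
within_r = [
    np.corrcoef(ice_cream[temps == t], polio[temps == t])[0, 1]
    for t in (10.0, 20.0, 30.0)
]
```

Under this model the pooled correlation is close to 1 while each within-group correlation is close to 0, matching the pattern described in the question.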

`5`. FIFA22 (100 points)

Use the code below to load the FIFA22 data, then answer the questions.

df=pd.read_csv('https://raw.githubusercontent.com/guebin/2021DV/master/_notebooks/2021-10-25-FIFA22_official_data.csv')

import pandas as pd
df=pd.read_csv('https://raw.githubusercontent.com/guebin/2021DV/master/_notebooks/2021-10-25-FIFA22_official_data.csv')

(a) Write code that selects the `Loaned From` and `Marking` columns, and check the values.

df.loc[:,['Loaned From','Marking']]

(b) Write code that excludes the `Loaned From` and `Marking` columns from the original data frame.

My Answer

df.drop(['Loaned From','Marking'],axis=1)

Professor's Answer

df.iloc[:,map(lambda x: 'Loaned From' != x and 'Marking' != x ,df.columns )]
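
The two answers above reach the same result by different routes. As a sketch on a toy data frame (the values and the `toy`/`excluded` names are made up), the name-based drop, a boolean column mask, and the `map`-based mask can be compared side by side.

```python
import pandas as pd

# Toy stand-in for the FIFA22 data frame (values are made up).
toy = pd.DataFrame({
    'Name': ['p1', 'p2'],
    'Loaned From': [None, 'Club X'],
    'Marking': [5.0, None],
    'Age': [20, 30],
})
excluded = ['Loaned From', 'Marking']

by_drop = toy.drop(columns=excluded)                # drop columns by name
by_mask = toy.loc[:, ~toy.columns.isin(excluded)]   # boolean column mask
by_map = toy.iloc[:, list(map(lambda c: c not in excluded, toy.columns))]
```

All three keep only `Name` and `Age`. None of them modifies `toy` in place.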

(c) Apply `.dropna()` to the result of (b) to remove missing values. How many rows were removed as a result?

My Answer

len(df.drop(['Loaned From','Marking'],axis=1))-len(df.drop(['Loaned From','Marking'],axis=1).dropna())

2312

Professor's Answer

df.iloc[:,map(lambda x: 'Loaned From' != x and 'Marking' != x ,df.columns )].dropna()

print(str(16710-14398),'rows were removed.')

2312 rows were removed.
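
The count above is a number of rows, which is not the same as the number of missing cells. A minimal sketch on made-up data (the `toy` frame is hypothetical) shows the difference.

```python
import numpy as np
import pandas as pd

# Made-up frame: rows 1, 2 and 3 each contain at least one NaN.
toy = pd.DataFrame({
    'a': [1.0, np.nan, 3.0, np.nan],
    'b': [1.0, np.nan, np.nan, 4.0],
})

rows_removed = len(toy) - len(toy.dropna())   # rows containing at least one NaN
missing_cells = int(toy.isna().sum().sum())   # individual NaN cells
```

Here `dropna()` removes 3 rows even though the frame holds 4 missing cells, because one row has two missing values.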

(d) Using the code below, convert the values of `Wage` in the result of (c) appropriately.

### Code 1
def convert_currency(value):
    floatvalue = 0.0
    strvalue=""
    if "M" in value:
        strvalue=value.replace("M","").replace("€","")
        floatvalue=float(float(strvalue)*1000000)
    elif "K" in value:
        strvalue=value.replace("K","").replace("€","")
        floatvalue=float(float(strvalue)*1000)
    else:
        floatvalue=value.replace("€","")
    return floatvalue

Code source: https://www.kaggle.com/talhademirezen/cost-effective-youth-players-fifa22

Professor's Answer

def convert_currency(value):
    floatvalue = 0.0
    strvalue=""
    if "M" in value:
        strvalue=value.replace("M","").replace("€","")
        floatvalue=float(float(strvalue)*1000000)
    elif "K" in value:
        strvalue=value.replace("K","").replace("€","")
        floatvalue=float(float(strvalue)*1000)
    else:
        floatvalue=value.replace("€","")
    return floatvalue

df=df.iloc[:,map(lambda x: 'Loaned From' != x and 'Marking' != x ,df.columns )].dropna()
df['Wage']=list(map(convert_currency,df.Wage))

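Note: applying this function to the same column a second time raises an error, because the converted values are no longer strings and `"M" in value` fails on a float!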
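
One way to avoid the repeated-application error is to pass non-string values through unchanged. The function below is a hypothetical variant written for illustration, not the Kaggle author's or the professor's code. It also returns a float in every branch, which the original function does not.

```python
def convert_currency_safe(value):
    """Convert strings such as '€206.9M' or '€238K' to a float amount.

    Hypothetical variant of convert_currency: non-string inputs (for example
    values that were already converted) are simply passed through float().
    """
    if not isinstance(value, str):
        return float(value)
    text = value.replace("€", "")
    if text.endswith("M"):
        return float(text[:-1]) * 1_000_000
    if text.endswith("K"):
        return float(text[:-1]) * 1_000
    return float(text)

once = [convert_currency_safe(v) for v in ["€206.9M", "€238K", "€0"]]
twice = [convert_currency_safe(v) for v in once]  # second pass leaves values unchanged
```

With this variant, running the conversion cell twice in a notebook gives the same column both times.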

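(e) Following the details below, visualize the mean market value (`Value`) by `Best Position` using a barplot.

Put Best Position on the x-axis and the mean of Value on the y-axis.
Highlight the 3 positions with the highest mean Value in a different color.

Example visualization

Professor's Answer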

df['Value']=list(map(convert_currency,df.Value))

import numpy as np

_df=df.groupby('Best Position').agg({'Value':np.mean})\
.rename(columns={'Value':'mean(Value)'})\
.sort_values('mean(Value)',ascending=False)\
.reset_index()
_df['Highlight']=_df['mean(Value)']>=_df['mean(Value)'][2]
_df

from plotnine import *

ggplot(_df)+geom_bar(aes(x='Best Position',y='mean(Value)',fill='Highlight'),stat='identity')

My Answer

df3=df.groupby(by='Best Position').agg({'Value':np.mean}).rename(columns={'Value':'mean(Value)'}).reset_index()
def f(x): 
    if x=='CF':y='True'
    elif x=='CM':y='True'
    elif x=='LW':y='True'
    else: y='False'
    return y 
df3['Bp']=df3['Best Position']
df3['Hightlight']=list(map(f,df3.Bp))
ggplot(df3)+geom_bar(aes(x='Best Position',y='mean(Value)',color='Hightlight',fill='Hightlight'),stat='identity')
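
The answer above hard-codes the three highlighted positions. As a sketch, the flag can instead be derived from the data with `nlargest`. The means below are made-up numbers used only to show the step.

```python
import pandas as pd

# Made-up position means, only to illustrate the flagging step.
means = pd.DataFrame({
    'Best Position': ['CB', 'CF', 'CM', 'GK', 'LW'],
    'mean(Value)': [3.0, 9.1, 5.6, 2.7, 6.4],
})

# Positions with the three largest means, then a boolean flag per row.
top3 = means.nlargest(3, 'mean(Value)')['Best Position']
means['Highlight'] = means['Best Position'].isin(top3)
```

The resulting `Highlight` column can be mapped to `fill` in `geom_bar` in the same way as in the answers above, and it keeps working if the data change.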

(f) Following the details below, draw a scatter plot of (`Dribbling`,`SlidingTackle`).

Details

(i) Facet the plot by the value of Best Position.

(ii) Represent Age by color.

(iii) Draw the points with transparency alpha=0.5 and size=0.5.

ggplot(df)\
+geom_point(aes(x='Dribbling',y='SlidingTackle',color='Age'),alpha=0.5,size=0.5)\
+facet_wrap('Best Position')

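(g) Who interpreted the figure in (f) correctly?

(Hani) Players in the GK position have relatively low Dribbling and SlidingTackle values compared with other positions.

(Na Ae-ri) In every position, Dribbling and SlidingTackle can be regarded as independent of each other.

(Hong Du-kkae) For the CAM position, age and Dribbling are positively correlated.

(Go Eun-ae) For the CB position, the correlation coefficient between age and Dribbling is almost 0.

Answer: Hani, Hong Du-kkae, Go Eun-ae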

(h) Select only the players whose `Best Position` is "CAM" or "CB" and draw a scatter plot of (`Dribbling`,`SlidingTackle`).

Details

x-axis: Dribbling, y-axis: SlidingTackle
Vary the point size according to Value
Vary the color according to age

Example visualization

Professor's Answer

ggplot(df.loc[(df['Best Position']=='CAM') |(df['Best Position']=='CB')])\
+geom_point(aes(x='Dribbling',y='SlidingTackle',color='Age',size='Value'),alpha=0.5)\
+facet_wrap('Best Position')
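
The row filter in this answer combines two equality tests with `|`. A sketch of an equivalent filter using `isin`, on a made-up frame (`toy` is hypothetical), is shown below. It scales more easily when more positions are added.

```python
import pandas as pd

# Made-up rows standing in for the FIFA22 data.
toy = pd.DataFrame({
    'Best Position': ['CAM', 'CB', 'GK', 'CB', 'ST'],
    'Dribbling': [80, 40, 15, 45, 70],
})

# Two equality tests joined with |, as in the answer above.
by_or = toy.loc[(toy['Best Position'] == 'CAM') | (toy['Best Position'] == 'CB')]

# The same filter expressed as membership in a list.
by_isin = toy.loc[toy['Best Position'].isin(['CAM', 'CB'])]
```

Both expressions select the same rows and keep the original row order.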

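My Answer

Avoid storing intermediate results in new objects where possible. Python tends not to copy data on assignment (which is also part of why it is fast), so a new name may refer to the same underlying data, and modifying it carelessly can change the original data!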

a=df.groupby(by='Best Position').get_group('CAM')
b=df.groupby(by='Best Position').get_group('CB')
c=pd.concat([a,b])
ggplot(c)+geom_point(aes(x='Dribbling',y='SlidingTackle',color='Age',size='Value'),alpha=0.3)+facet_wrap('Best Position')
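
The warning in the note above about shared data can be illustrated with a small sketch. The frame and names are made up. Plain assignment creates a second name for the same object, whereas `.copy()` creates independent data.

```python
import pandas as pd

original = pd.DataFrame({'a': [1, 2, 3]})

alias = original            # no copy: both names refer to the same object
alias.loc[0, 'a'] = 100     # this also changes `original`

independent = original.copy()   # explicit copy
independent.loc[1, 'a'] = -1    # `original` is left untouched
```

After these steps `original` holds 100 in its first row but still holds 2 in its second row.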

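(i) Select all correct interpretations of the figure in (h).

(Hani) Age and Value are positively correlated.

(Na Ae-ri) Therefore a football player's Value rises as Age increases. In other words, there is a causal relationship between Age and Value.

(Hong Du-kkae) For the CAM position, Dribbling ability and Value appear to be positively correlated.

(Go Eun-ae) For the CB position, on the other hand, SlidingTackle rather than Dribbling ability can be said to be positively correlated with Value.

Answer: Hani, Hong Du-kkae, Go Eun-ae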

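`6`. Hani's walking route (40 points)

Hani is a dog who loves running in the park. Below is the route Hani walked through the park with its owner. The walk is divided into course A, from home to the park, and course B, from the park back home.

- Course A: home $\to$ cafe $\to$ elementary school front gate $\to$ park

- Course B: park $\to$ elementary school back gate $\to$ veterinary clinic $\to$ home

The coordinates of each location are as follows.

Home: (0,0)
Cafe: (1,2)
Elementary school front gate: (4,3)
Park: (5,5)
Elementary school back gate: (4.1,3)
Veterinary clinic: (1,0.5)

Hani's stamina is 100 when leaving home, and at each waypoint it has decreased in proportion to the distance traveled. For example, Hani's stamina at the cafe on course A can be computed as follows.

$100- \sqrt{1^2+2^2}$

In addition, suppose Hani started running at the park's front gate and reached the park's back gate after using up 60 stamina. (That is, if Hani's stamina at the park's front gate is $x$, then Hani's stamina at the park's back gate is $x-60$.)

Visualize how Hani's stamina changes along the route.

(Solution)

My Answer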

x=[0,1,4,5,5,4.1,1,0]
y=[0,2,3,5,5,3,0.5,0]

def inc(x,y): return x**2+y**2

def f(x,y):
    if x=='B':
        y=y-70
    elif x=='A':
        y=y
    return y

g=np.sqrt(np.array(list(map(inc,x,y))))
g1=100-np.array(g.cumsum())
course=['A']*4+['B']*4

df1=pd.DataFrame({'x':x,'y':y,'g1':g1,'course':course})
df1

def f(x,y):
    if x=='B':
        y=y-70+50**0.5
    elif x=='A':
        y=y
    return y

df1['stamina']=list(map(f,df1.course,df1.g1))
df1

ggplot(df1)+geom_line(aes(x='x',y='y',size='stamina',color='course'))+geom_point(aes(x='x',y='y'))

Professor's Answer

x=[0, 1, 4, 5,  5, 4.1, 1, 0] 
y=[0, 2, 3, 5,  5, 3, 0.5, 0] 
course=['A']*4 + ['B']*4 
_delta=[np.sqrt((x[i]-x[i-1])**2+(y[i]-y[i-1])**2)for i in range(len(x))]
stamina = np.array([100,100,100,100, 40, 40, 40, 40]) - [sum(_delta[:(i+1)]) for i in range(8)]

ggplot(pd.DataFrame({'x':x, 'y':y, 'course':course, 'stamina':stamina}))+\
geom_point(aes(x='x',y='y'))+\
geom_line(aes(x='x',y='y',size='stamina',color='course'))
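
The stamina at each waypoint depends on the length of each segment walked, not on the distance from the origin. As a sketch of the same calculation using vectorized NumPy calls, the segment lengths can be taken from consecutive differences. The names `segment` and `run_cost` are illustrative.

```python
import numpy as np

x = np.array([0, 1, 4, 5, 5, 4.1, 1, 0])
y = np.array([0, 2, 3, 5, 5, 3, 0.5, 0])

# Length of the segment leading into each waypoint (0 for the starting point).
segment = np.hypot(np.diff(x, prepend=x[0]), np.diff(y, prepend=y[0]))

# The 60 points spent running in the park apply to every waypoint on course B.
run_cost = np.array([0, 0, 0, 0, 60, 60, 60, 60])

stamina = 100 - np.cumsum(segment) - run_cost
```

Using `prepend` makes the first step explicitly zero, so the calculation does not rely on the route starting and ending at the same point.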

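`7`. Fill in the blanks with suitable values to complete an example that illustrates Simpson's paradox, and visualize it. (30 points)

The following are the season-by-season free-throw success rates of basketball players A and B.

	Season 1	Season 2
Player A	7/10	999999/1000000
Player B	8/10	4/4

Each entry in the table is successes/total free-throw attempts.

Fill in ? with suitable values so that player B has the higher free-throw success rate in both season 1 and season 2, but player A has the higher rate when seasons 1 and 2 are combined. (That is, fill in ? to construct data that illustrates Simpson's paradox.)

Using the resulting data, visualize Simpson's paradox. (That is, draw barplots of the free-throw success rates for seasons 1 and 2 and of the overall free-throw success rate.)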
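
Before plotting, the filled-in table can be checked with exact arithmetic. The sketch below uses Python's `fractions` module. The dictionary layout and the helper names `rate` and `overall` are illustrative choices.

```python
from fractions import Fraction

# Successes and attempts per player, in the order [season 1, season 2].
made = {"A": [7, 999999], "B": [8, 4]}
attempts = {"A": [10, 1000000], "B": [10, 4]}

def rate(player, season):
    """Exact success rate of a player in one season (0 or 1)."""
    return Fraction(made[player][season], attempts[player][season])

def overall(player):
    """Exact success rate of a player over both seasons combined."""
    return Fraction(sum(made[player]), sum(attempts[player]))

b_wins_each_season = all(rate("B", s) > rate("A", s) for s in range(2))
a_wins_overall = overall("A") > overall("B")
```

Both flags come out true: B leads in each season (8/10 vs 7/10 and 4/4 vs 999999/1000000), while A leads overall (1000006/1000010 vs 12/14).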

ref: https://books.google.co.kr/books?id=qy4iEAAAQBAJ&pg=PT87&lpg=PT87&dq=%EC%95%84%EC%9D%B4%EC%8A%A4%ED%81%AC%EB%A6%BC%EC%9D%84+%EB%A7%8E%EC%9D%B4+%EB%A8%B9%EC%9C%BC%EB%A9%B4+%EA%B1%B8%EB%A6%AC%EB%8A%94+%EB%B3%91+%EC%86%8C%EC%95%84%EB%A7%88%EB%B9%84&source=bl&ots=V9B7ZG6oR-&sig=ACfU3U0UMd4ehuRXYxI69TT6lIlU-r91bA&hl=en&sa=X&ved=2ahUKEwj13JSV19LzAhVEGaYKHdgfDgcQ6AF6BAgCEAM#v=onepage&q=%EC%95%84%EC%9D%B4%EC%8A%A4%ED%81%AC%EB%A6%BC%EC%9D%84%20%EB%A7%8E%EC%9D%B4%20%EB%A8%B9%EC%9C%BC%EB%A9%B4%20%EA%B1%B8%EB%A6%AC%EB%8A%94%20%EB%B3%91%20%EC%86%8C%EC%95%84%EB%A7%88%EB%B9%84&f=false

My Answer

season=(['season1']*2+['season2']*2+['season1']*2+['season2']*2)
player=['A']*4+['B']*4
STATE=['WIN','LOSE']*4
COUNT=[7,3,999999,1,8,2,4,0]
df=pd.DataFrame({'season':season,'STATE':STATE,'player':player,'COUNT':COUNT})
df

_df1=df.groupby(by=['player','STATE']).agg({'COUNT':np.sum}).reset_index()
_df2=df.groupby(by='player').agg({'COUNT':np.sum}).reset_index().rename(columns={'COUNT':'SUM'})
td=pd.merge(_df1,_df2)
td['PROP']=td.COUNT/td.SUM
ggplot(td.query('STATE=="WIN"'))+geom_bar(aes(x='player',y='PROP',fill='player'),stat='identity')

td=df.groupby(['season','player']).agg({'COUNT':sum}).reset_index().rename(columns={'COUNT':'SUM'}).merge(df)
td['PROP']=td.COUNT/td.SUM
ggplot(td.query('STATE=="WIN"'))+geom_bar(aes(x='player',y='PROP',fill='player'),stat='identity')+facet_wrap('season')

	Loaned From	Marking
0	NaN	NaN
1	NaN	NaN
2	NaN	NaN
3	NaN	NaN
4	NaN	NaN
...	...	...
16705	NaN	5.0
16706	NaN	NaN
16707	NaN	NaN
16708	NaN	NaN
16709	NaN	15.0

	ID	Name	Age	Photo	Nationality	Flag	Overall	Potential	Club	Club Logo	...	SlidingTackle	GKDiving	GKHandling	GKKicking	GKPositioning	GKReflexes	Best Position	Best Overall Rating	Release Clause	DefensiveAwareness
0	212198	Bruno Fernandes	26	https://cdn.sofifa.com/players/212/198/22_60.png	Portugal	https://cdn.sofifa.com/flags/pt.png	88	89	Manchester United	https://cdn.sofifa.com/teams/11/30.png	...	65.0	12.0	14.0	15.0	8.0	14.0	CAM	88.0	€206.9M	72.0
1	209658	L. Goretzka	26	https://cdn.sofifa.com/players/209/658/22_60.png	Germany	https://cdn.sofifa.com/flags/de.png	87	88	FC Bayern München	https://cdn.sofifa.com/teams/21/30.png	...	77.0	13.0	8.0	15.0	11.0	9.0	CM	87.0	€160.4M	74.0
2	176580	L. Suárez	34	https://cdn.sofifa.com/players/176/580/22_60.png	Uruguay	https://cdn.sofifa.com/flags/uy.png	88	88	Atlético de Madrid	https://cdn.sofifa.com/teams/240/30.png	...	38.0	27.0	25.0	31.0	33.0	37.0	ST	88.0	€91.2M	42.0
3	192985	K. De Bruyne	30	https://cdn.sofifa.com/players/192/985/22_60.png	Belgium	https://cdn.sofifa.com/flags/be.png	91	91	Manchester City	https://cdn.sofifa.com/teams/10/30.png	...	53.0	15.0	13.0	5.0	10.0	13.0	CM	91.0	€232.2M	68.0
4	224334	M. Acuña	29	https://cdn.sofifa.com/players/224/334/22_60.png	Argentina	https://cdn.sofifa.com/flags/ar.png	84	84	Sevilla FC	https://cdn.sofifa.com/teams/481/30.png	...	82.0	8.0	14.0	13.0	13.0	14.0	LB	84.0	€77.7M	80.0
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
16705	240558	18 L. Clayton	17	https://cdn.sofifa.com/players/240/558/18_60.png	England	https://cdn.sofifa.com/flags/gb-eng.png	53	70	Cheltenham Town	https://cdn.sofifa.com/teams/1936/30.png	...	12.0	55.0	54.0	52.0	50.0	59.0	GK	52.0	€238K	NaN
16706	262846	�. Dobre	20	https://cdn.sofifa.com/players/262/846/22_60.png	Romania	https://cdn.sofifa.com/flags/ro.png	53	63	FC Academica Clinceni	https://cdn.sofifa.com/teams/113391/30.png	...	12.0	57.0	52.0	53.0	48.0	58.0	GK	53.0	€279K	5.0
16707	241317	21 Xue Qinghao	19	https://cdn.sofifa.com/players/241/317/21_60.png	China PR	https://cdn.sofifa.com/flags/cn.png	47	60	Shanghai Shenhua FC	https://cdn.sofifa.com/teams/110955/30.png	...	9.0	49.0	48.0	45.0	38.0	52.0	GK	47.0	€223K	21.0
16708	259646	A. Shaikh	18	https://cdn.sofifa.com/players/259/646/22_60.png	India	https://cdn.sofifa.com/flags/in.png	47	67	ATK Mohun Bagan FC	https://cdn.sofifa.com/teams/113146/30.png	...	13.0	49.0	41.0	39.0	45.0	49.0	GK	47.0	€259K	7.0
16709	178453	07 A. Censori	17	https://cdn.sofifa.com/players/178/453/07_60.png	Italy	https://cdn.sofifa.com/flags/it.png	28	38	Arezzo	https://cdn.sofifa.com/teams/110907/30.png	...	NaN	7.0	1.0	36.0	6.0	9.0	ST	36.0	NaN	NaN

	ID	Name	Age	Photo	Nationality	Flag	Overall	Potential	Club	Club Logo	...	SlidingTackle	GKDiving	GKHandling	GKKicking	GKPositioning	GKReflexes	Best Position	Best Overall Rating	Release Clause	DefensiveAwareness
0	212198	Bruno Fernandes	26	https://cdn.sofifa.com/players/212/198/22_60.png	Portugal	https://cdn.sofifa.com/flags/pt.png	88	89	Manchester United	https://cdn.sofifa.com/teams/11/30.png	...	65.0	12.0	14.0	15.0	8.0	14.0	CAM	88.0	€206.9M	72.0
1	209658	L. Goretzka	26	https://cdn.sofifa.com/players/209/658/22_60.png	Germany	https://cdn.sofifa.com/flags/de.png	87	88	FC Bayern München	https://cdn.sofifa.com/teams/21/30.png	...	77.0	13.0	8.0	15.0	11.0	9.0	CM	87.0	€160.4M	74.0
2	176580	L. Suárez	34	https://cdn.sofifa.com/players/176/580/22_60.png	Uruguay	https://cdn.sofifa.com/flags/uy.png	88	88	Atlético de Madrid	https://cdn.sofifa.com/teams/240/30.png	...	38.0	27.0	25.0	31.0	33.0	37.0	ST	88.0	€91.2M	42.0
3	192985	K. De Bruyne	30	https://cdn.sofifa.com/players/192/985/22_60.png	Belgium	https://cdn.sofifa.com/flags/be.png	91	91	Manchester City	https://cdn.sofifa.com/teams/10/30.png	...	53.0	15.0	13.0	5.0	10.0	13.0	CM	91.0	€232.2M	68.0
4	224334	M. Acuña	29	https://cdn.sofifa.com/players/224/334/22_60.png	Argentina	https://cdn.sofifa.com/flags/ar.png	84	84	Sevilla FC	https://cdn.sofifa.com/teams/481/30.png	...	82.0	8.0	14.0	13.0	13.0	14.0	LB	84.0	€77.7M	80.0
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
16703	259718	F. Gebhardt	19	https://cdn.sofifa.com/players/259/718/22_60.png	Germany	https://cdn.sofifa.com/flags/de.png	52	66	FC Basel 1893	https://cdn.sofifa.com/teams/896/30.png	...	10.0	53.0	45.0	47.0	52.0	57.0	GK	52.0	€361K	6.0
16704	251433	B. Voll	20	https://cdn.sofifa.com/players/251/433/22_60.png	Germany	https://cdn.sofifa.com/flags/de.png	58	69	F.C. Hansa Rostock	https://cdn.sofifa.com/teams/27/30.png	...	10.0	59.0	60.0	56.0	55.0	61.0	GK	58.0	€656K	5.0
16706	262846	�. Dobre	20	https://cdn.sofifa.com/players/262/846/22_60.png	Romania	https://cdn.sofifa.com/flags/ro.png	53	63	FC Academica Clinceni	https://cdn.sofifa.com/teams/113391/30.png	...	12.0	57.0	52.0	53.0	48.0	58.0	GK	53.0	€279K	5.0
16707	241317	21 Xue Qinghao	19	https://cdn.sofifa.com/players/241/317/21_60.png	China PR	https://cdn.sofifa.com/flags/cn.png	47	60	Shanghai Shenhua FC	https://cdn.sofifa.com/teams/110955/30.png	...	9.0	49.0	48.0	45.0	38.0	52.0	GK	47.0	€223K	21.0
16708	259646	A. Shaikh	18	https://cdn.sofifa.com/players/259/646/22_60.png	India	https://cdn.sofifa.com/flags/in.png	47	67	ATK Mohun Bagan FC	https://cdn.sofifa.com/teams/113146/30.png	...	13.0	49.0	41.0	39.0	45.0	49.0	GK	47.0	€259K	7.0

	x	y	g1	course
0	0.0	0.0	100.000000	A
1	1.0	2.0	97.763932	A
2	4.0	3.0	92.763932	A
3	5.0	5.0	85.692864	A
4	5.0	5.0	78.621796	B
5	4.1	3.0	73.541442	B
6	1.0	0.5	72.423408	B
7	0.0	0.0	72.423408	B

	Best Position	mean(Value)	Highlight
0	CF	9.122222e+06	True
1	LW	6.443137e+06	True
2	CM	5.630414e+06	True
3	CAM	4.356162e+06	False
4	RW	3.977832e+06	False
5	CDM	3.539740e+06	False
6	LWB	3.451340e+06	False
7	LM	3.439977e+06	False
8	ST	3.295080e+06	False
9	RB	3.203283e+06	False
10	LB	3.051887e+06	False
11	CB	3.038834e+06	False
12	RWB	3.023522e+06	False
13	GK	2.703686e+06	False
14	RM	2.550153e+06	False

	season	STATE	player	COUNT
0	season1	WIN	A	7
1	season1	LOSE	A	3
2	season2	WIN	A	999999
3	season2	LOSE	A	1
4	season1	WIN	B	8
5	season1	LOSE	B	2
6	season2	WIN	B	4
7	season2	LOSE	B	0
