파이온 - Python Online Learning

import pandas as pd

# Create the monthly CSV files ('월' = month, '매출' = sales)
df_jan = pd.DataFrame({'월': ['1월', '1월'], '매출': [100, 150]})
df_feb = pd.DataFrame({'월': ['2월', '2월'], '매출': [200, 180]})
df_jan.to_csv('/tmp/jan.csv', index=False)
df_feb.to_csv('/tmp/feb.csv', index=False)

# Read each file back in
jan = pd.read_csv('/tmp/jan.csv')
feb = pd.read_csv('/tmp/feb.csv')

# Stack the two DataFrames vertically.
all_data = pd.concat([jan, feb], ignore_index=True)
print(all_data)
print("\nTotal rows:", len(all_data))

Combining DataFrames with pd.concat()

Use pd.concat() when data that is split across several files needs to be combined into a single DataFrame.

Basic syntax

combined = pd.concat([df1, df2, df3], ignore_index=True)

How it works

jan (January):    feb (February):
    월   매출         월   매출
0  1월  100      0  2월  200
1  1월  150      1  2월  180

         ↓ pd.concat([jan, feb], ignore_index=True)

all_data (combined result):
    월   매출
0  1월  100
1  1월  150
2  2월  200   ← the feb rows are appended below
3  2월  180

The ignore_index option

# ignore_index=False (default): each frame keeps its original index labels
→ 0, 1, 0, 1  (duplicates!)

# ignore_index=True: the result gets a fresh index
→ 0, 1, 2, 3  (clean!)
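
The difference can be checked directly. The following is a minimal sketch that reuses the jan and feb frames from the diagram above; the variable names kept and renumbered are only illustrative.

```python
import pandas as pd

# Same sample data as the diagram ('월' = month, '매출' = sales)
jan = pd.DataFrame({"월": ["1월", "1월"], "매출": [100, 150]})
feb = pd.DataFrame({"월": ["2월", "2월"], "매출": [200, 180]})

kept = pd.concat([jan, feb])                           # ignore_index=False (default)
renumbered = pd.concat([jan, feb], ignore_index=True)  # fresh 0..n-1 index

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]

# With duplicated labels, looking up label 0 returns two rows instead of one.
print(len(kept.loc[[0]]))        # 2
print(len(renumbered.loc[[0]]))  # 1
```

Duplicate labels are legal in pandas, but label-based lookups then return several rows, which is usually not what you want after stacking files.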

Vertical vs. horizontal concatenation

# Vertical (top to bottom, adds rows) - the default
pd.concat([df1, df2])  # axis=0

# Horizontal (side by side, adds columns)
pd.concat([df1, df2], axis=1)
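
To see what each axis does to the shape of the result, here is a small sketch. The second frame and its '비용' (cost) column are hypothetical, added only so the horizontal case has a new column to attach.

```python
import pandas as pd

df1 = pd.DataFrame({"월": ["1월", "2월"], "매출": [100, 200]})
df2 = pd.DataFrame({"비용": [60, 90]})  # hypothetical 'cost' column for illustration

# axis=0: rows are stacked, columns stay the same
stacked = pd.concat([df1, df1], ignore_index=True)

# axis=1: columns are placed side by side, rows are matched by index label
side = pd.concat([df1, df2], axis=1)

print(stacked.shape)       # (4, 2)
print(side.shape)          # (2, 3)
print(list(side.columns))  # ['월', '매출', '비용']
```

Note that axis=1 lines rows up by index label, not by position, so both frames here rely on sharing the default 0, 1 index.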

In practice: combining many files at once

import glob

import pandas as pd

# Read every CSV in a folder and combine them
files = sorted(glob.glob("data/*.csv"))  # sorted, because glob does not guarantee order
df_list = [pd.read_csv(f) for f in files]
all_data = pd.concat(df_list, ignore_index=True)
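
The same pattern can be tried end to end without an existing data folder. The sketch below is one possible variation, not the only way to do it: it writes two sample files into a temporary directory, and it adds a hypothetical source column (not part of the original example) so each row records which file it came from.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as folder:
    # Write two small monthly files (file names are illustrative)
    pd.DataFrame({"월": ["1월", "1월"], "매출": [100, 150]}).to_csv(
        os.path.join(folder, "jan.csv"), index=False)
    pd.DataFrame({"월": ["2월", "2월"], "매출": [200, 180]}).to_csv(
        os.path.join(folder, "feb.csv"), index=False)

    # Sort the paths so the row order is reproducible
    files = sorted(glob.glob(os.path.join(folder, "*.csv")))

    frames = []
    for path in files:
        df = pd.read_csv(path)
        df["source"] = os.path.basename(path)  # hypothetical column: origin file
        frames.append(df)

    all_data = pd.concat(frames, ignore_index=True)

print(all_data)
print("Total rows:", len(all_data))
```

Because the paths are sorted alphabetically, feb.csv is read before jan.csv here; sort by a date column afterwards if chronological order matters.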

concat() vs. merge()

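Aspect	concat()	merge()
Method	Simple stacking (top to bottom or side by side)	Combines rows on a shared column
Use case	Combining data with the same structure	Joining different tables
SQL analogy	UNION ALL	JOIN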
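
The contrast in the table can be sketched with a few rows. The targets frame and its '목표' (target) column are hypothetical, introduced only to give merge() a second table to join against.

```python
import pandas as pd

jan = pd.DataFrame({"월": ["1월", "1월"], "매출": [100, 150]})
feb = pd.DataFrame({"월": ["2월", "2월"], "매출": [200, 180]})

# concat: same columns, rows are simply stacked
stacked = pd.concat([jan, feb], ignore_index=True)

# merge: different tables matched on the shared '월' column
sales = pd.DataFrame({"월": ["1월", "2월"], "매출": [250, 380]})
targets = pd.DataFrame({"월": ["1월", "2월"], "목표": [200, 400]})  # hypothetical table
joined = pd.merge(sales, targets, on="월")

print(stacked.shape)         # (4, 2)
print(joined.shape)          # (2, 3)
print(list(joined.columns))  # ['월', '매출', '목표']
```

concat() grows the row count and leaves the columns alone, while merge() keeps one row per matching key and adds the columns from the other table.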

💡 Key point: to combine several files that share the same structure, use pd.concat().

Combining multiple CSV files: pd.concat()