
Data Codebook

This codebook documents all seven built-in datasets in assemblykor. For each dataset, we list every variable with its type, missing rate, and value distribution. The datasets link to one another through member_id (proposer_id in bills), bill_id, and assembly, as shown in the relationship diagram at the end.


legislators

947 rows, 15 variables. MP metadata for the 20th-22nd Korean National Assembly.

| Variable | Type | Missing | Distribution |
|---|---|---|---|
| member_id | character | 0.0% | 661 unique; top: 04T3751T, 0VU8517U, 1WE5693J |
| assembly | numeric | 0.0% | min=20, Q1=20, median=21, Q3=22, max=22 |
| name | character | 0.0% | 653 unique; top: 강훈식, 권성동, 권칠승 |
| name_hanja | character | 0.0% | 660 unique; top: 南仁順, 姜勳植, 孟聖奎 |
| name_eng | character | 0.7% | 652 unique; top: AHN CHEOLSOO, AHN GYUBACK, AN HOYOUNG |
| party | character | 0.0% | 17 unique; top: 더불어민주당, 국민의힘, 자유한국당 |
| party_elected | character | 0.0% | 20 unique; top: 더불어민주당, 새누리당, 국민의힘 |
| district | character | 0.0% | 299 unique; top: 비례대표, 강원 원주시갑, 경기 성남시분당구갑 |
| district_type | character | 0.0% | 2 unique; top: constituency, proportional |
| committees | character | 0.0% | 335 unique; top: (empty), 외교통일위원회, 국토교통위원회 |
| gender | character | 0.0% | 2 unique; top: M, F |
| birth_date | Date | 0.0% | 1940-07-11 to 1995-01-02 |
| seniority | numeric | 0.0% | min=1, Q1=1, median=2, Q3=3, max=8 |
| n_bills | numeric | 0.0% | min=3, Q1=432, median=702, Q3=1098, max=4198 |
| n_bills_lead | numeric | 0.0% | min=0, Q1=34, median=56, Q3=83, max=696 |
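
The table reports 947 rows but only 661 unique member_id values, which suggests one row per member per assembly term. A minimal sketch of how that could be checked, using a toy data frame in place of the real dataset (the IDs are copied from the table above; the assembly values attached to them are illustrative assumptions):

```r
# Toy stand-in for `legislators`: one row per member per assembly term.
# The assembly values paired with each ID are made up for illustration.
legislators_toy <- data.frame(
  member_id = c("04T3751T", "04T3751T", "0VU8517U", "1WE5693J"),
  assembly  = c(21, 22, 20, 22),
  stringsAsFactors = FALSE
)

# member_id alone repeats when a member serves more than one term ...
any_dup_id <- anyDuplicated(legislators_toy$member_id) > 0

# ... while the (member_id, assembly) pair is expected to identify one row.
any_dup_pair <- anyDuplicated(legislators_toy[, c("member_id", "assembly")]) > 0
```

Running the same two checks on the real legislators data would confirm whether the pair is a usable key before any join.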

bills

60,925 rows, 9 variables. Legislative bill metadata (20th-22nd assembly).

| Variable | Type | Missing | Distribution |
|---|---|---|---|
| bill_id | character | 0.0% | 60925 unique; top: PRC_A1A6B0A8G3D0P1G8B4B2F1J6J3I8Q3, PRC_A1A6B0W6I2T0P1H6L0D4U0A9N0G3B2, PRC_A1A6B0X7D2X8K1H6B5W3G0C3F5P5L7 |
| bill_no | numeric | 0.0% | min=2000001, Q1=2017465, median=2109713, Q3=2200455, max=2217175 |
| assembly | numeric | 0.0% | min=20, Q1=20, median=21, Q3=22, max=22 |
| bill_name | character | 0.0% | 4530 unique; top: 조세특례제한법 일부개정법률안, 공직선거법 일부개정법률안, 국회법 일부개정법률안 |
| committee | character | 0.2% | 33 unique; top: 행정안전위원회, 보건복지위원회, 국토교통위원회 |
| propose_date | Date | 0.0% | 2016-05-30 to 2026-02-27 |
| result | character | 20.0% | 8 unique; top: 임기만료폐기, 대안반영폐기, 수정가결 |
| proposer | character | 0.0% | 770 unique; top: 황주홍, 민형배, 윤준병 |
| proposer_id | character | 0.0% | 778 unique; top: JOY4394O, VRY5522V, JC14718Q |
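
Note that bill_no is numeric here but character in the votes dataset. Some join functions refuse or mishandle mismatched key types, so it may be safer to align them first. The sketch below uses toy data frames with base R `merge()`. The pairing of bill numbers with assembly and result values is illustrative, and whether bill_no or bill_id is the better key for the real data is not confirmed by this codebook.

```r
# Toy stand-ins; only the column types mirror the codebook.
bills_toy <- data.frame(
  bill_no  = c(2000491, 2012299, 2109713),      # numeric, as in `bills`
  assembly = c(20, 20, 21)                      # illustrative values
)
votes_toy <- data.frame(
  bill_no = c("2000491", "2012299"),            # character, as in `votes`
  result  = c("원안가결", "수정가결"),           # illustrative pairing
  stringsAsFactors = FALSE
)

# Align the key type before joining.
votes_toy$bill_no <- as.numeric(votes_toy$bill_no)

# Left join: keep every bill, attach a vote result where one exists.
bills_votes <- merge(bills_toy, votes_toy, by = "bill_no", all.x = TRUE)
```

Bills that never reached a plenary vote keep an `NA` in the joined `result` column.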

wealth

2,928 rows, 14 variables. Legislator asset declaration panel (2015-2025, 13 disclosure periods).

| Variable | Type | Missing | Distribution |
|---|---|---|---|
| member_id | character | 0.0% | 772 unique; top: 04T3751T, 1WE5693J, 1Y73132H |
| year | numeric | 0.0% | min=2015, Q1=2017, median=2020, Q3=2022, max=2024 |
| name | character | 0.0% | 771 unique; top: 권성동, 김도읍, 김상훈 |
| total_assets | numeric | 0.0% | min=36960, Q1=1074921, median=1814372, Q3=3297303, max=443526250 |
| total_debt | numeric | 0.0% | min=0, Q1=44580, median=220526, Q3=572624, max=20027140 |
| net_worth | numeric | 0.0% | min=-1427653, Q1=798320, median=1472084, Q3=2786012, max=443526250 |
| real_estate | numeric | 0.0% | min=0, Q1=566574, median=1009008, Q3=1926902, max=42900041 |
| building | numeric | 0.0% | min=0, Q1=484864, median=903348, Q3=1687250, max=42886438 |
| land | numeric | 0.0% | min=0, Q1=0, median=5462, Q3=139536, max=25616148 |
| deposits | numeric | 0.0% | min=5827, Q1=218059, median=420716, Q3=854379, max=46929336 |
| stocks | numeric | 0.0% | min=0, Q1=0, median=0, Q3=25910, max=375332731 |
| n_properties | numeric | 0.0% | min=0, Q1=3, median=4, Q3=5, max=37 |
| has_seoul_property | logical | 0.0% | TRUE: 2299, FALSE: 629 |
| has_gangnam_property | logical | 0.0% | TRUE: 809, FALSE: 2119 |

seminars

5,962 rows, 18 variables. Legislator-year policy seminar activity (17th-22nd assembly, 2004-2025).

| Variable | Type | Missing | Distribution |
|---|---|---|---|
| name | character | 0.0% | 1081 unique; top: 안민석, 조정식, 변재일 |
| member_id | character | 4.5% | 1088 unique; top: IN328264, XBT9550Q, ZA54991S |
| year | numeric | 0.0% | min=2004, Q1=2011, median=2016, Q3=2021, max=2025 |
| assembly | numeric | 0.0% | min=17, Q1=18, median=20, Q3=21, max=22 |
| party | character | 0.2% | 42 unique; top: 더불어민주당, 한나라당, 새누리당 |
| camp | character | 0.0% | 5 unique; top: 민주계, 보수계, 기타 |
| seniority | numeric | 4.1% | min=1, Q1=1, median=1, Q3=2, max=6 |
| n_seminars | numeric | 0.0% | min=1, Q1=2, median=5, Q3=12, max=94 |
| n_cross_party | numeric | 0.0% | min=0, Q1=0, median=1, Q3=4, max=55 |
| cross_party_ratio | numeric | 0.0% | min=0, Q1=0, median=0, Q3=0, max=1 |
| avg_coalition_size | numeric | 0.0% | min=1, Q1=2, median=3, Q3=10, max=74 |
| is_governing | logical | 0.0% | TRUE: 2795, FALSE: 3167 |
| is_female | logical | 4.1% | TRUE: 1015, FALSE: 4703 |
| is_proportional | logical | 4.1% | TRUE: 1370, FALSE: 4348 |
| is_seoul | logical | 4.1% | TRUE: 805, FALSE: 4913 |
| province | character | 4.2% | 35 unique; top: 비례대, 경기, 서울 |
| total_terms | numeric | 4.1% | min=1, Q1=1, median=2, Q3=3, max=5 |
| n_bills_led | numeric | 0.0% | min=0, Q1=21, median=42, Q3=72, max=696 |

speeches

15,843 rows, 9 variables. Committee speech records from the Science and ICT Committee (22nd assembly, 2024).

| Variable | Type | Missing | Distribution |
|---|---|---|---|
| assembly | numeric | 0.0% | min=22, Q1=22, median=22, Q3=22, max=22 |
| date | Date | 0.0% | 2024-06-11 to 2024-12-27 |
| committee | character | 0.0% | 1 unique; top: 과학기술정보방송통신위원회 |
| speaker | character | 0.0% | 165 unique; top: 위원장 최민희, 김현 위원, 노종면 위원 |
| role | character | 0.0% | 14 unique; top: legislator, chair, witness |
| speaker_name | character | 0.0% | 139 unique; top: 최민희, 김현, 노종면 |
| member_id | character | 0.0% | 24 unique; top: nan, 6247, 1728 |
| speech_order | numeric | 0.0% | min=1, Q1=330, median=775, Q3=1468, max=4091 |
| speech | character | 0.0% | 15795 unique; top: 1분만 더 쓰겠습니다. 추가질의 안 하겠습니다. 그러면 그 반대의 경우에 대해서 무죄 판결을 받았다든지 이런 분들에 대해서는 혹시 사과하신 적 있습니까?, PPT 하나만 띄워 주시지요. (영상자료를 보며) 자, 계속 일부 표현의 문제라고 지적하시는 이 부분입니다. 대개 40분에서 50분 정도의 프로그램을 제작을 하면 핵심적인 화면이라고 하는 것은 10분 내외, 그것도 구하지 못해서 사실 거의 대부분 자료화면으로 프로그램을 만들 때도 있는데 이 화면이 거의 1분 전후로 나갔던 것으로 제가 기억을 하고, 저는 지금도 이 화면이 정말 생생합니다. 대법원에서 허위보도라고 했습니다. 그다음요. 이분, 인간 광우병이라고 MBC PD수첩에서 주장한 이분, 아마 지금 이 자리에 앉아 계신 모든 분들이 저분의 인터뷰 기억하실 겁니다. 이것 역시 허위보도라고 했습니다. 하나 더 보여 주시지요. 한국인이 MM형 유전자기 때문에 광우병에 걸린 소를 먹으면 광우병에 걸릴 확률이 94.3%다. 이 세 가지 핵심적인, 그 당시 PD수첩의 광우병 보도를 구성하는 이 세 가지 핵심적인 보도를 대법원이 다 허위라고 판시를 했는데 이것을 일부 표현의 잘못이라고 주장하시는 겁니까?, 감사원 출신이니까 혹시 그 부분에 대해서 내부적으로 이게 어떻게 된 일인가, 방통위 직원들이 어떻게 된 일인가 뭐 알아보신 것 있습니까? |
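
The most frequent member_id value in this table is the literal string "nan", even though the missing rate is reported as 0.0%. This suggests that speakers without an ID (for example witnesses) are stored as text rather than as `NA`. A minimal sketch of recoding them before a join, on a toy data frame with hypothetical speaker labels:

```r
# Toy stand-in for `speeches`; speaker labels are hypothetical placeholders.
speeches_toy <- data.frame(
  speaker_name = c("speaker A", "speaker B", "speaker C"),
  member_id    = c("6247", "1728", "nan"),
  stringsAsFactors = FALSE
)

# Recode the text placeholder "nan" as a true missing value.
speeches_toy$member_id[speeches_toy$member_id == "nan"] <- NA_character_

# Count rows that now have no member ID.
n_missing <- sum(is.na(speeches_toy$member_id))
```

The sample IDs shown here look numeric, unlike the alphanumeric IDs in legislators, so it may be worth verifying that the two ID schemes actually match before joining on member_id.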

votes

7,997 rows, 13 variables. Plenary vote tallies (20th-22nd assembly).

| Variable | Type | Missing | Distribution |
|---|---|---|---|
| bill_id | character | 0.0% | 8049 unique; top: ARC_D1U6W0S6P2G7T1G1C2X3M1S6L7N2N0, ARC_A1D6N0F9G0E9M1T7F4B8E3C0T6E9E3, ARC_A1E8A1H2I2X1C1Q7Q4W7A0G6F1S9Q2 |
| bill_no | character | 0.0% | 8031 unique; top: 2022996, 2000491, 2012299 |
| bill_name | character | 0.0% | 5752 unique; top: 도로교통법 일부개정법률안(대안)(행정안전위원장), 자동차관리법 일부개정법률안(대안)(국토교통위원장), 국민건강보험법 일부개정법률안(대안)(보건복지위원장) |
| assembly | numeric | 0.0% | min=20, Q1=20, median=21, Q3=21, max=22 |
| committee | character | 0.0% | 47 unique; top: 농림축산식품해양수산위원회, 국토교통위원회, 보건복지위원회 |
| vote_date | Date | 0.0% | 2016-06-09 to 2026-03-12 |
| result | character | 0.0% | 3 unique; top: 원안가결, 수정가결, 부결 |
| bill_type | character | 0.0% | 9 unique; top: 법률안, 예산안, 결의안 |
| total_members | numeric | 0.0% | min=288, Q1=296, median=299, Q3=300, max=300 |
| voted | numeric | 0.0% | min=3, Q1=188, median=213, Q3=238, max=297 |
| yes | numeric | 0.0% | min=1, Q1=180, median=204, Q3=229, max=297 |
| no | numeric | 0.0% | min=0, Q1=0, median=0, Q3=1, max=187 |
| abstain | numeric | 0.0% | min=0, Q1=0, median=3, Q3=7, max=64 |
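
The tally columns lend themselves to simple derived measures such as turnout and yes share. The sketch below computes both on toy rows. It also checks whether yes + no + abstain equals voted, which is an assumption about how the columns relate and is not stated in this codebook. All numbers are made up to resemble the ranges above.

```r
# Toy stand-in for `votes`; the tallies are illustrative, not real records.
votes_toy <- data.frame(
  total_members = c(300, 296),
  voted         = c(213, 188),
  yes           = c(204, 180),
  no            = c(6, 1),
  abstain       = c(3, 7)
)

# Share of sitting members who cast a vote.
votes_toy$turnout <- votes_toy$voted / votes_toy$total_members

# Share of cast votes that were in favour.
votes_toy$yes_share <- votes_toy$yes / votes_toy$voted

# Assumed identity: the three tallies sum to the number who voted.
tally_ok <- with(votes_toy, yes + no + abstain == voted)
```

If `tally_ok` turns out to be `FALSE` for some rows in the real data, the columns are probably defined differently from what this sketch assumes.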

roll_calls

368,210 rows, 8 variables. Member-level roll call votes (22nd assembly, 1,233 bills).

| Variable | Type | Missing | Distribution |
|---|---|---|---|
| bill_id | character | 0.0% | 1286 unique; top: ARC_B2U4U0H7N2J2O0N8R5I5F1J9G2D2U5, ARC_C2A4B0A8H3R0V0R9F2L0E3Z7P8F7S5, ARC_C2A4M0W7H2E2U0L8Z5W5D3P2U7T2A3 |
| assembly | numeric | 0.0% | min=22, Q1=22, median=22, Q3=22, max=22 |
| member_name | character | 0.0% | 304 unique; top: 강경숙, 강대식, 강득구 |
| member_id | character | 0.0% | 304 unique; top: 04T3751T, 0698755I, 0R68099X |
| party | character | 0.0% | 8 unique; top: 더불어민주당, 국민의힘, 조국혁신당 |
| district | character | 0.0% | 255 unique; top: 비례대표, 강원 강릉시, 강원 동해시태백시삼척시정선군 |
| vote | character | 0.0% | 4 unique; top: 찬성, 불참, 반대 |
| vote_date | Date | 0.0% | 2024-07-04 to 2026-03-12 |

Dataset relationship diagram

                    legislators
                   (member_id + assembly)
                   /       |        \
                  /        |         \
               wealth   seminars   bills
            (member_id) (member_id) (proposer_id)
                                      |
                                    votes
                                  (bill_id)
                                      |
                                  roll_calls
                               (bill_id + member_id)
                                      |
                                  legislators
                                  (member_id)

    speeches --- legislators (member_id, 22nd assembly only)

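member_id is the primary join key for the member-level datasets (legislators, wealth, seminars, roll_calls, and speeches). bills links to legislators through proposer_id, and votes links to bills and roll_calls through bill_id. Use assembly as a secondary key when joining datasets that span multiple assembly terms.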
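
To illustrate why the secondary key matters, the sketch below joins toy versions of roll_calls and legislators with base R `merge()`, once on both keys and once on member_id alone. The IDs come from the tables above; the assembly, seniority, and vote values attached to them are illustrative assumptions.

```r
# Toy stand-ins; values paired with each ID are made up for illustration.
legislators_toy <- data.frame(
  member_id = c("04T3751T", "04T3751T", "0698755I"),
  assembly  = c(21, 22, 22),
  seniority = c(1, 2, 1),
  stringsAsFactors = FALSE
)
roll_calls_toy <- data.frame(
  member_id = c("04T3751T", "0698755I", "04T3751T"),
  assembly  = c(22, 22, 22),
  vote      = c("찬성", "반대", "불참"),
  stringsAsFactors = FALSE
)

# Joining on both keys matches each roll call to exactly one legislator row.
joined <- merge(roll_calls_toy, legislators_toy,
                by = c("member_id", "assembly"))

# Joining on member_id alone duplicates roll calls for multi-term members.
joined_id_only <- merge(roll_calls_toy, legislators_toy, by = "member_id")
```

In this toy case the two-key join returns three rows, while the member_id-only join returns five, because the member with two terms is matched twice per roll call.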
