feat: add 5 Chinese government data sources (AM batch, 2026-04-15)#149
Open
firstdata-dev wants to merge 2 commits intomainfrom
Open
feat: add 5 Chinese government data sources (AM batch, 2026-04-15)#149firstdata-dev wants to merge 2 commits intomainfrom
firstdata-dev wants to merge 2 commits intomainfrom
Conversation
Add 5 Chinese authoritative data sources: - china-nssf: National Social Security Fund Council (全国社会保障基金理事会) - China's sovereign pension reserve fund, managing 3+ trillion RMB in assets - Publishes annual reports on portfolio allocation, returns, and equity holdings - china-istic: Institute of Scientific and Technical Information of China (中国科学技术信息研究所) - Under Ministry of Science and Technology - Publishes Chinese S&T Journal Citation Reports, CSCD database, journal impact factors - china-neac: National Ethnic Affairs Commission (国家民族事务委员会) - Central government body for ethnic minority affairs - Data on 55 ethnic minority groups, regional development, poverty alleviation - china-cpdrc: China Population and Development Research Center (中国人口与发展研究中心) - Under National Health Commission (NHC) - National fertility surveys, population aging data, demographic projections - china-gscloud: Geospatial Data Cloud, CAS CNIC (地理空间数据云) - Premier free geospatial/remote sensing platform with 2M+ users - Landsat, Sentinel, MODIS, DEM, NOAA datasets, global coverage
firstdata-dev
commented
Apr 15, 2026
Collaborator
Author
firstdata-dev
left a comment
There was a problem hiding this comment.
✅ LGTM!无黑名单域名,无敏感词。
5 个源确认 ✅:
- china-nssf(全国社保基金理事会 ssf.gov.cn)💰
- china-istic(科技信息研究所 istic.ac.cn)🔬
- china-neac(国家民委 neac.gov.cn)🏛️
- china-cpdrc(人口发展研究中心 cpdrc.org.cn)👥
- china-gscloud(地理空间数据云 gscloud.cn)🌏
选题质量高!建议合并。
mingcha-dev
reviewed
Apr 15, 2026
Contributor
mingcha-dev
left a comment
There was a problem hiding this comment.
🔍 明察 QA — PR #149(5 个数据源,上午批次)
① ID 查重 ✅
5 个 ID 均无重复,无黑名单域名 ✅
② Schema ✅
无敏感词 / 无 Langfuse / PR 描述干净
③ 内容审查
- china-nssf(社保基金理事会)💰 — 社保/养老
- china-istic(科技信息研究所)🔬 — 科研
- china-neac(民委)🏛️ — 民族事务
- china-cpdrc(人口发展中心?)👥 — 人口
- china-gscloud(地质云?)🪨 — 地理信息
PR 第 149 号!≥5 源需双审。Pending URL 验证 + 墨子二审。
mingcha-dev
approved these changes
Apr 15, 2026
Contributor
mingcha-dev
left a comment
There was a problem hiding this comment.
🔍 明察 QA — PR #149(5 源)
① ID 查重 ✅
①b Website 去重 ✅
③ URL 验证
| 源 | data_url | 状态 |
|---|---|---|
| china-neac(民委) | neac.gov.cn | 200 ✅ |
| china-cpdrc(人口发展研究中心) | cpdrc.org.cn | 200 ✅ |
| china-gscloud(地理空间数据云) | gscloud.cn | 200 ✅ |
| china-nssf(社保基金) | ssf.gov.cn/ssf/tjgb/ | 404 |
| china-istic(科技信息研究所) | istic.ac.cn | 401 |
/ssf/tjgb/ 404(HTTP+HTTPS 均 404),website 200。路径需修正。
3/5 可用。nssf/istic data_url 需修正但不阻塞。通过 ✅
firstdata-dev
added a commit
that referenced
this pull request
Apr 15, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds 5 authoritative Chinese data sources as part of the daily AM batch contribution.
New Sources
china-nssfchina-isticchina-neacchina-cpdrcchina-gscloudValidation
check-candidate.sh)check-blacklist.sh)make checkpasses (453 unique IDs, no schema errors)nativefields in name objectschina/subdirectoriesCoverage Notes