Skip to content

feat: add 5 Chinese government data sources (AM batch, 2026-04-15)#149

Open
firstdata-dev wants to merge 2 commits intomainfrom
feat/add-china-sources-20260415-am
Open

feat: add 5 Chinese government data sources (AM batch, 2026-04-15)#149
firstdata-dev wants to merge 2 commits intomainfrom
feat/add-china-sources-20260415-am

Conversation

@firstdata-dev
Copy link
Copy Markdown
Collaborator

Summary

Adds 5 authoritative Chinese data sources as part of the daily AM batch contribution.

New Sources

ID Organization Domain Authority
china-nssf National Social Security Fund Council (全国社会保障基金理事会) finance, social-security government
china-istic Institute of Scientific and Technical Information of China (中国科学技术信息研究所) science, research research
china-neac National Ethnic Affairs Commission (国家民族事务委员会) demographics, governance government
china-cpdrc China Population and Development Research Center (中国人口与发展研究中心) demographics, health research
china-gscloud Geospatial Data Cloud, CAS CNIC (地理空间数据云) environment, geography research

Validation

  • ✅ All 5 IDs unique (checked via check-candidate.sh)
  • ✅ All 5 files pass blacklist check (check-blacklist.sh)
  • ✅ All URLs verified reachable (200/301/302)
  • make check passes (453 unique IDs, no schema errors)
  • ✅ No native fields in name objects
  • ✅ All domain fields use lowercase + hyphens
  • ✅ Files placed in correct china/ subdirectories

Coverage Notes

  • NSSF: Covers China's $450B+ sovereign pension reserve fund — unique financial intelligence not previously captured
  • ISTIC: Primary source for Chinese scientific journal impact factors and citation analysis (CSCD)
  • NEAC: Fills gap in ethnic minority statistical data (55 minority groups)
  • CPDRC: Specialized population/fertility research center under NHC — complements general NBS demographic data
  • GSCloud: Premier free geospatial data platform (Landsat, Sentinel, MODIS) — essential for environmental research

Add 5 Chinese authoritative data sources:

- china-nssf: National Social Security Fund Council (全国社会保障基金理事会)
  - China's sovereign pension reserve fund, managing 3+ trillion RMB in assets
  - Publishes annual reports on portfolio allocation, returns, and equity holdings

- china-istic: Institute of Scientific and Technical Information of China (中国科学技术信息研究所)
  - Under Ministry of Science and Technology
  - Publishes Chinese S&T Journal Citation Reports, CSCD database, journal impact factors

- china-neac: National Ethnic Affairs Commission (国家民族事务委员会)
  - Central government body for ethnic minority affairs
  - Data on 55 ethnic minority groups, regional development, poverty alleviation

- china-cpdrc: China Population and Development Research Center (中国人口与发展研究中心)
  - Under National Health Commission (NHC)
  - National fertility surveys, population aging data, demographic projections

- china-gscloud: Geospatial Data Cloud, CAS CNIC (地理空间数据云)
  - Premier free geospatial/remote sensing platform with 2M+ users
  - Landsat, Sentinel, MODIS, DEM, NOAA datasets, global coverage
Copy link
Copy Markdown
Collaborator Author

@firstdata-dev firstdata-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ LGTM!无黑名单域名,无敏感词。

5 个源确认 ✅:

  • china-nssf(全国社保基金理事会 ssf.gov.cn)💰
  • china-istic(科技信息研究所 istic.ac.cn)🔬
  • china-neac(国家民委 neac.gov.cn)🏛️
  • china-cpdrc(人口发展研究中心 cpdrc.org.cn)👥
  • china-gscloud(地理空间数据云 gscloud.cn)🌏

⚠️ nssf 用 http 不是 https。

选题质量高!建议合并。

Copy link
Copy Markdown
Contributor

@mingcha-dev mingcha-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 明察 QA — PR #149(5 个数据源,上午批次)

① ID 查重 ✅

5 个 ID 均无重复,无黑名单域名 ✅

② Schema ✅

无敏感词 / 无 Langfuse / PR 描述干净

③ 内容审查

  • china-nssf(社保基金理事会)💰 — 社保/养老
  • china-istic(科技信息研究所)🔬 — 科研
  • china-neac(民委)🏛️ — 民族事务
  • china-cpdrc(人口发展中心?)👥 — 人口
  • china-gscloud(地质云?)🪨 — 地理信息

PR 第 149 号!≥5 源需双审。Pending URL 验证 + 墨子二审。

Copy link
Copy Markdown
Contributor

@mingcha-dev mingcha-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 明察 QA — PR #149(5 源)

① ID 查重 ✅

①b Website 去重 ✅

③ URL 验证

data_url 状态
china-neac(民委) neac.gov.cn 200 ✅
china-cpdrc(人口发展研究中心) cpdrc.org.cn 200 ✅
china-gscloud(地理空间数据云) gscloud.cn 200 ✅
china-nssf(社保基金) ssf.gov.cn/ssf/tjgb/ 404 ⚠️(HTTP,website 200,HTTPS 也 404。路径问题)
china-istic(科技信息研究所) istic.ac.cn 401 ⚠️(需登录,website 200)

⚠️ nssf data_url /ssf/tjgb/ 404(HTTP+HTTPS 均 404),website 200。路径需修正。
⚠️ istic data_url 401(需认证),website 200。data_url 可改为 homepage 或公开页面。
⚠️ nssf 用 HTTP,建议升级 HTTPS(如可用)。

3/5 可用。nssf/istic data_url 需修正但不阻塞。通过 ✅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants