@@ -62,9 +62,19 @@ We've made contributing easier! You can now edit simple JSON files organized by
6262** Important for Contributors:**
6363- ✅ ** DO** : Edit JSON files in ` contributions/ ` directory
6464- ❌ ** DON'T** : Edit SQL, CSV, XML, YAML, or other export files (auto-generated)
65+ - ❌ ** DON'T** : Edit GeoJSON or TOON format files (auto-generated from database)
6566- ❌ ** DON'T** : Run build scripts or exports locally (GitHub Actions handles this)
6667- 🔒 ** MySQL workflow** : Reserved for repository maintainers only
6768
69+ ### Understanding Export Formats
70+
71+ All data you contribute via JSON is automatically exported to ** 11 different formats** :
72+ - ** Core Formats** : JSON, MySQL, PostgreSQL, SQLite, SQL Server, MongoDB, XML, YAML, CSV
73+ - ** Geographic Format** : GeoJSON (RFC 7946 standard for mapping applications)
74+ - ** AI-Optimized Format** : TOON (Token-Oriented Object Notation - reduces LLM token usage by ~ 40%)
75+
76+ You don't need to worry about these formats - they're automatically generated from the MySQL database!
77+
6878## Glance at Table Structure
6979
7080### regions.sql
@@ -128,15 +138,17 @@ We've made contributing easier! You can now edit simple JSON files organized by
128138| ----------------- | --------------- | -------------- | -------------- |
129139| ` id ` | integer | Unique ID - omit for new states (auto-assigned) | Auto |
130140| ` name ` | string | The official name of the state. Use WikiData or Wikipedia or some other legitimate source. | required |
141+ | ` state_code ` | string | State/province code (e.g., "CA" for California) | required |
131142| ` country_id ` | integer | Unique id of parent country from ` countries.sql ` | required |
132143| ` country_code ` | string | ISO2 code of the parent country | required |
133144| ` fips_code ` | string | FIPS code for the state |
134145| ` iso2 ` | string | ISO2 code of the state |
135146| ` iso3166_2 ` | string | ISO 3166-2 subdivision code |
136- | ` type ` | string | Type of state (province, state, etc.) |
147+ | ` type ` | string | Type of state (province, state, region, etc.) |
137148| ` level ` | integer | Administrative level of the subdivision |
138149| ` parent_id ` | integer | ID of parent administrative division |
139150| ` native ` | string | Native name of the state |
151+ | ` population ` | integer | Population of the state - [ Wikipedia] ( https://en.wikipedia.org/wiki/List_of_states_by_population ) |
140152| ` latitude ` | decimal | Latitude coordinates |
141153| ` longitude ` | decimal | Longitude coordinates |
142154| ` timezone ` | string | IANA timezone identifier (e.g., America/New_York) |
@@ -158,9 +170,146 @@ We've made contributing easier! You can now edit simple JSON files organized by
158170| ` latitude ` | decimal | Latitude coordinates | required |
159171| ` longitude ` | decimal | Longitude coordinates | required |
160172| ` native ` | string | Native name of the city |
161- | ` timezone ` | string | IANA timezone identifier (e.g., America/New_York) |
173+ | ` population ` | integer | Population of the city - [ Wikipedia] ( https://en.wikipedia.org/wiki/List_of_cities_by_population ) |
174+ | ` type ` | string | Type of settlement (city, town, village, etc.) |
175+ | ` level ` | integer | Administrative level |
176+ | ` parent_id ` | integer | ID of parent administrative division |
177+ | ` timezone ` | string | IANA timezone identifier (e.g., America/New_York) | required |
162178| ` translations ` | text | JSON object with name translations |
163179| ` created_at ` | timestamp | Optional - Creation timestamp (ISO 8601 format). If omitted, database uses default value. |
164180| ` updated_at ` | timestamp | Optional - Last update timestamp (ISO 8601 format). If omitted, database auto-updates. |
165181| ` flag ` | boolean | Optional - Auto-managed by system, defaults to 1. Contributors can omit this field. |
166182| ` wikiDataId ` | string | The unique ID from wikiData.org |
183+
184+ ## Data Quality Guidelines
185+
186+ ### Required Data Standards
187+
188+ #### Timezone Information (Critical!)
189+ - ** 100% of cities MUST have valid IANA timezone identifiers**
190+ - Use tools like [ TimeZoneDB] ( https://timezonedb.com/ ) or [ GeoNames] ( https://www.geonames.org/ ) to find correct timezones
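191+ - Format: ` Area/Location ` , usually ` Continent/City ` (e.g., ` America/New_York ` , ` Europe/London ` , ` Asia/Tokyo ` ); a few identifiers have three parts (e.g., ` America/Argentina/Buenos_Aires ` )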
192+ - ** Why it matters** : This database maintains 100% timezone coverage - don't break it!
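
A timezone value can be sanity-checked locally before opening a PR. The following is a minimal sketch, not part of this repository: the helper names `has_iana_shape` and `is_known_timezone` are hypothetical, and the lookup assumes Python 3.9+ with time zone data available (from the operating system or the `tzdata` package).

```python
import re
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# Shape of an "Area/Location" identifier. A few names have a third part,
# such as America/Argentina/Buenos_Aires.
_TZ_SHAPE = re.compile(r"^[A-Za-z_]+(/[A-Za-z0-9_+\-]+){1,2}$")


def has_iana_shape(name: str) -> bool:
    """Return True if name looks like Area/Location (a shape check only)."""
    return bool(_TZ_SHAPE.match(name))


def is_known_timezone(name: str) -> bool:
    """Return True if name has the right shape and the local tz data knows it."""
    if not has_iana_shape(name):
        return False
    try:
        ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError, OSError):
        # Unknown key, malformed key, or a directory such as "America/Argentina".
        return False
    return True
```

Abbreviations such as `PST` fail the shape check right away, which matches the rule above that only full IANA identifiers are accepted.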
193+
194+ #### Coordinates Accuracy
195+ - Use precise decimal coordinates (minimum 5 decimal places recommended)
196+ - Verify coordinates using Google Maps, OpenStreetMap, or official sources
197+ - Format:
198+ - Latitude: -90 to +90 (negative = South, positive = North)
199+ - Longitude: -180 to +180 (negative = West, positive = East)
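
The range and precision rules above can be expressed as a small check. This is a sketch under assumptions: the function name `coordinate_problems` is hypothetical, and it accepts coordinates either as strings or as numbers because the storage type in the JSON files is not stated here.

```python
from decimal import Decimal, InvalidOperation


def coordinate_problems(latitude, longitude, min_decimals=5):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for label, raw, limit in (("latitude", latitude, 90), ("longitude", longitude, 180)):
        try:
            value = Decimal(str(raw))
        except InvalidOperation:
            problems.append(f"{label} is not a number: {raw!r}")
            continue
        if not value.is_finite():
            problems.append(f"{label} must be finite, got {raw!r}")
            continue
        if not -limit <= value <= limit:
            problems.append(f"{label} {value} is outside -{limit}..+{limit}")
        # The exponent of a Decimal is the negated count of decimal places.
        decimals = max(-value.as_tuple().exponent, 0)
        if decimals < min_decimals:
            problems.append(
                f"{label} has {decimals} decimal places; {min_decimals}+ recommended"
            )
    return problems
```

For example, `coordinate_problems("40.71427", "-74.00597")` returns an empty list, while a latitude of `"95.00000"` is reported as out of range.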
200+
201+ #### Naming Conventions
202+ - Use official, commonly recognized names in English
203+ - Add native names in the ` native ` field
204+ - Use proper capitalization (e.g., "New York" not "new york")
205+ - Avoid abbreviations unless officially used (e.g., "St." in "St. Louis" is acceptable)
206+
207+ ### Data Sources
208+
209+ ** Recommended Sources (in priority order):**
210+ 1 . ** Official Government Websites** - Most authoritative
211+ 2 . ** WikiData** ([ wikidata.org] ( https://www.wikidata.org/ ) ) - Structured, multilingual data
212+ 3 . ** Wikipedia** - Well-sourced, community-verified
213+ 4 . ** GeoNames** ([ geonames.org] ( https://www.geonames.org/ ) ) - Comprehensive geographic database
214+ 5 . ** OpenStreetMap** - Community-maintained geographic data
215+
216+ ** Always include source in your PR description!**
217+
218+ ### Common Mistakes to Avoid
219+
220+ ❌ ** Don't Do This:**
221+ - Adding cities without timezone information
222+ - Using approximate coordinates (e.g., country center for city location)
223+ - Copying data without verification
224+ - Adding duplicate entries (check first!)
225+ - Using non-standard timezone names (e.g., "PST" instead of "America/Los_Angeles")
226+
227+ ✅ ** Do This Instead:**
228+ - Research proper IANA timezone for each city
229+ - Use precise coordinates for the city center or main landmark
230+ - Verify data from multiple reliable sources
231+ - Search existing data before adding new entries
232+ - Use official IANA timezone database format
233+
234+ ### Population Data (Optional but Recommended)
235+
236+ When adding population data:
237+ - Use recent census data or official estimates
238+ - Include source year if possible in PR description
239+ - Round to reasonable precision (avoid false precision)
240+ - ** Format** : Integer (e.g., ` 1000000 ` not ` "1,000,000" ` )
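
To illustrate the format rule, here is a minimal sketch that tells a JSON integer apart from a formatted string. The helper name `population_problem` and the sample records are made up for illustration.

```python
import json


def population_problem(value):
    """Return a message if value is not a plain non-negative integer, else None."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return f"population must be an integer, got {value!r}"
    if value < 0:
        return f"population cannot be negative, got {value}"
    return None


# Made-up records: the first uses a JSON number, the second a formatted string.
good = json.loads('{"name": "Example City", "population": 1000000}')
bad = json.loads('{"name": "Example City", "population": "1,000,000"}')

print(population_problem(good["population"]))
print(population_problem(bad["population"]))
```

The first call reports no problem; the second flags the quoted, comma-separated value.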
241+
242+ ### How to Find Foreign Keys
243+
244+ ** Finding State IDs:**
245+ ``` bash
246+ # Search in contributions/states/states.json
247+ grep -A 5 ' "name": "California"' contributions/states/states.json
248+ ```
249+
250+ ** Finding Country IDs:**
251+ ``` bash
252+ # Search in contributions/countries/countries.json
253+ grep -A 5 ' "name": "United States"' contributions/countries/countries.json
254+ ```
255+
256+ Or use the [ CSC Update Tool] ( https://manager.countrystatecity.in/ ) which automatically looks up IDs for you!
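
If `grep` output is hard to read, the same lookup can be sketched in Python. This assumes each contributions file is a JSON array of objects carrying the `name`, `id`, and `country_code` fields from the tables above; the helper names and the sample records are hypothetical.

```python
import json
from pathlib import Path


def load_records(path):
    """Load a contributions file, assumed to be a JSON array of objects."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def find_by_name(records, name):
    """Return every record whose "name" matches exactly (there may be several)."""
    return [record for record in records if record.get("name") == name]


# Made-up sample records for illustration; real IDs come from the JSON files.
sample_states = [
    {"id": 1, "name": "Example State", "country_code": "XX"},
    {"id": 2, "name": "Another State", "country_code": "XX"},
]

for match in find_by_name(sample_states, "Example State"):
    print(match["id"], match["country_code"])
```

Against the real data, the call would look like `find_by_name(load_records("contributions/states/states.json"), "California")`. Names can repeat across countries, so compare `country_code` before using an `id`.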
257+
258+ ## Pull Request Guidelines
259+
260+ ### Before Submitting
261+
262+ - [ ] Data verified from authoritative sources
263+ - [ ] Timezone validated using IANA timezone database
264+ - [ ] Coordinates checked on a map
265+ - [ ] No duplicate entries
266+ - [ ] Source included in PR description
267+ - [ ] Only JSON files in ` contributions/ ` edited
268+
269+ ### PR Description Template
270+
271+ ``` markdown
272+ ## Summary
273+ [Brief description of changes]
274+
275+ ## Type of Change
276+ - [ ] New city/state/country
277+ - [ ] Update existing data
278+ - [ ] Fix incorrect data
279+ - [ ] Add missing fields
280+
281+ ## Data Sources
282+ - Source 1: [URL]
283+ - Source 2: [URL]
284+
285+ ## Checklist
286+ - [ ] Timezones verified
287+ - [ ] Coordinates verified
288+ - [ ] Data sources cited
289+ - [ ] No duplicate entries
290+ ```
291+
292+ ### Review Process
293+
294+ 1 . ** Automated Checks** : GitHub Actions validates JSON format
295+ 2 . ** Data Import** : Your changes are imported to MySQL
296+ 3 . ** Export Generation** : All 11 formats are regenerated
297+ 4 . ** Maintainer Review** : Human review of data quality and sources
298+ 5 . ** Merge** : Changes go live in the next release!
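
The automated check in step 1 is not shown in this guide, but a JSON syntax error can be caught locally before pushing. A minimal sketch with the Python standard library follows; the function name `json_syntax_error` is hypothetical and says nothing about how the GitHub Actions workflow is actually implemented.

```python
import json


def json_syntax_error(text):
    """Return a 'line, column: message' string for invalid JSON, else None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# A trailing comma is a common mistake when adding a field by hand.
print(json_syntax_error('[{"name": "Example City"}]'))
print(json_syntax_error('[{"name": "Example City",}]'))
```

The first call prints `None`; the second prints the position of the stray comma.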
299+
300+ ## Need Help?
301+
302+ ### Tools & Resources
303+ - ** [ CSC Update Tool] ( https://manager.countrystatecity.in/ ) ** - Easiest way to contribute (GUI)
304+ - ** [ API Documentation] ( https://docs.countrystatecity.in/ ) ** - Explore existing data
305+ - ** [ Demo Database] ( https://demo.countrystatecity.in/ ) ** - Browse online
306+ - ** [ IANA Timezone Database] ( https://www.iana.org/time-zones ) ** - Official timezone reference
307+
308+ ### Questions?
309+ - Open a [ GitHub Discussion] ( https://github.com/dr5hn/countries-states-cities-database/discussions )
310+ - Check existing [ Issues] ( https://github.com/dr5hn/countries-states-cities-database/issues )
311+ - Review [ contributions/README.md] ( ../contributions/README.md ) for detailed examples
312+
313+ ## Recognition
314+
315+ All contributors are recognized in our [ README] ( ../README.md ) and commit history. Thank you for helping maintain the most comprehensive open geographical database! 🌍