Published December 12, 2023 | Version v1
Journal article
Open
Everyday language input and production in 1,001 children from six continents
Creators
- 1. Harvard University
- 2. University of Manitoba
- 3. Stockholm University
- 4. Max Planck Institute for Psycholinguistics
- 5. University of Connecticut
- 6. Purdue University
- 7. Stockholm University
- 8. Basque Center on Cognition Brain and Language
- 9. PSL University
- 10. University of Chicago
- 11. Ohio State University
- 12. Royal Dutch Kentalis Utrecht
Description
Language is a universal human ability, acquired readily by young children, who otherwise struggle with many basics of survival. And yet, language ability is variable across individuals. Naturalistic and experimental observations suggest that children's linguistic skills vary with factors like socioeconomic status and children's gender. But which factors really influence children's day-to-day language use? Here, we leverage speech technology in a big-data approach to report on a unique cross-cultural and diverse data set: >2,500 d-long, child-centered audio-recordings of 1,001 2- to 48-mo-olds from 12 countries spanning six continents across urban, farmer-forager, and subsistence-farming contexts. As expected, age and language-relevant clinical risks and diagnoses predicted how much speech (and speech-like vocalization) children produced. Critically, so too did adult talk in children's environments: Children who heard more talk from adults produced more speech. In contrast to previous conclusions based on more limited sampling methods and a different set of language proxies, socioeconomic status (operationalized as maternal education) was not significantly associated with children's productions over the first 4 y of life, and neither were gender or multilingualism. These findings from large-scale naturalistic data advance our understanding of which factors are robust predictors of variability in the speech behaviors of young learners in a wide range of everyday contexts.
Data availability
Anonymized (tabular) data and all relevant code have been deposited with the Open Science Framework (https://osf.io/9v2m5/?viewonly=50df17fcf0844145ae692c35b78c6b08) (88). The raw audio recordings cannot be shared because of the consent process participants underwent, but all derived tabular data are fully shared.
Files
bergelson-et-al-2023-everyday-language-input-and-production-in-1-001-children-from-six-continents.pdf
Total size: 31.3 MB
| Name | MD5 | Size |
|---|---|---|
| Article | fa186169d66607fbfcc458aad108f575 | 29.7 MB |
| Supporting information | 98ab4d7113f82151678fd5d109f6284a | 1.6 MB |
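The MD5 checksums listed for the two files can be used to confirm that a download arrived intact. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `matches_checksum` are illustrative and not part of this record, and the local file path is an assumption about where the download was saved.

```python
import hashlib

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "article": "fa186169d66607fbfcc458aad108f575",
    "supporting_information": "98ab4d7113f82151678fd5d109f6284a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Return True if the file at `path` has the expected MD5 (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

Assuming the article PDF was saved under its listed name in the current directory, `matches_checksum("bergelson-et-al-2023-everyday-language-input-and-production-in-1-001-children-from-six-continents.pdf", EXPECTED_MD5["article"])` should return `True` for an uncorrupted copy. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.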
Additional details
Identifiers
- DOI
- 10.1073/pnas.2300671120
- Other
- oai:uchicago.tind.io:10429
Funding
- MechELex
- ANR-16-DATA-0004 ACLEW
- MechELex
- ANR-14-CE30-0003
- McDonnell Foundation
- ExELang
- ERC H2020
- NEH
- HJ-253479-17
- NIH
- DP5-OD019812
- NSF
- BCS-1844710
- NSF
- SBE-0354453
- ESRC
- ES/L008955/1
- SSHRC
- 435-2015-0628
- SSHRC
- 869-2016-0003
- NSERC
- 501769-2016-RGPDD
- Netherlands Organisation for Scientific Research
- 275-89-033
- NIMH
- K23MH111955
- NIDCD
- F31DC018219
- MAW
- 2011.0070
- MAW
- 2013.0056
- Basque Government
- BERC 2022-2025 program
- Spanish State Research Agency
- BCBL Severo Ochoa excellence accreditation
- Unknown funder
- Ramón y Cajal Fellowship
- Unknown funder
- ARC CE140100041