Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder characterized by restricted, repetitive behavior and impaired social communication and interaction. Significant challenges remain in diagnosing and subtyping ASD, due partly to the lack of a validated, standardized vocabulary for characterizing its clinical phenotypic presentation. Although the Human Phenotype Ontology (HPO) is useful for defining nuanced phenotypes in rare genetic diseases, it is insufficient for capturing behavioral and psychiatric phenotypes in people with ASD. There is therefore a clear need for a well-established phenotype terminology set that can help characterize ASD phenotypes from clinical narratives. To address this need, the investigators used natural language processing (NLP) techniques to identify and curate ASD phenotypic terms from high-quality unstructured clinical notes in an electronic health record (EHR) covering 8,499 people with ASD, 8,177 people with non-ASD psychiatric disorders, and 8,482 people without a documented psychiatric disorder. They then applied nonnegative matrix factorization to perform a dimensionality-reduction clustering analysis on the individuals with ASD. Using a note-processing pipeline that combined several cutting-edge NLP steps, the investigators identified 3,336 ASD terms linked to 1,943 unique medical concepts, yielding one of the largest ASD terminology sets to date. The extracted ASD terms were further organized into a formal ontology structure similar to that of the HPO. Clustering analysis revealed that these terms could be used in a diagnostic pipeline to distinguish people with ASD from people with other psychiatric disorders. The study group's ASD phenotype ontology can help clinicians characterize people with ASD, support automated diagnosis, and subtype people with ASD for personalized therapeutic decision-making.
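To illustrate the general idea behind clustering with nonnegative matrix factorization, the following is a minimal sketch and not the study's actual pipeline. It assumes a hypothetical patient-by-term count matrix (`toy_counts`), a hypothetical rank of 2, and the standard multiplicative-update rules for the squared-error objective; the function names `nmf` and `assign_clusters` are invented for this example. Each individual is assigned to the latent component with the largest weight in the factor matrix `W`.

```python
import random


def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def nmf(v, rank, iterations=500, seed=0, eps=1e-9):
    """Factor nonnegative v (n x m) into w (n x rank) and h (rank x m).

    Uses multiplicative updates for the squared-error objective.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization (an assumption of this sketch).
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def assign_clusters(w):
    """Assign each row (individual) to its highest-weight component."""
    return [max(range(len(row)), key=row.__getitem__) for row in w]


# Hypothetical toy data: rows are individuals, columns are phenotype terms.
# The first two rows use only terms 0-1; the last two use only terms 2-3.
toy_counts = [
    [2, 1, 0, 0],
    [4, 2, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 1, 1],
]

w, h = nmf(toy_counts, rank=2)
labels = assign_clusters(w)
print(labels)
```

In this toy setting the rows of `H` play the role of term groupings and the rows of `W` give each individual's loading on those groupings. Real clinical data would need term weighting, a principled choice of rank, and stability checks across random initializations, none of which are shown here.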