# LSA.303 Introduction to Computational Linguistics

Parsing SLP Chapter 13 Outline Parsing with CFGs Bottom-up, top-down CKY parsing Mention of Earley and chart parsing 02/22/20 Speech and Language Processing - Jurafsky and

2 Parsing Parsing with CFGs refers to the task of assigning trees to input strings Trees that covers all and only the elements of the input and has an S at the top This chapter: find all possible trees Next chapter (14): choose the most probable one

02/22/20 Speech and Language Processing - Jurafsky and 3 Parsing parsing involves a search 02/22/20 Speech and Language Processing - Jurafsky and

4 Top-Down Search Were trying to find trees rooted with an S; start with the rules that give us an S. Then we can work our way down from there to the words. 02/22/20 Speech and Language Processing - Jurafsky and

5 Top Down Space 02/22/20 Speech and Language Processing - Jurafsky and 6 Bottom-Up Parsing We also want trees that cover the input words.

Start with trees that link up with the words Then work your way up from there to larger and larger trees. 02/22/20 Speech and Language Processing - Jurafsky and 7 Bottom-Up Space

8 Top-Down and Bottom-Up Top-down Only searches for trees that can be Ss But also suggests trees that are not consistent with any of the words Bottom-up Only forms trees consistent with the words But suggests trees that make no sense globally

02/22/20 Speech and Language Processing - Jurafsky and 9 Control Which node to try to expand next Which grammar rule to use to expand a node One approach: exhaustive search of the space of possibilities

Not feasible Time is exponential in the number of nonterminals LOTS of repeated work, as the same constituent is created over and over (shared sub-problems) 02/22/20 Speech and Language Processing - Jurafsky and 10 Dynamic Programming DP search methods fill tables with partial

results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time (well, no not really well return to this point) Efficiently store ambiguous structures with shared sub-parts. Well cover two approaches that roughly correspond to bottom-up and top-down approaches. CKY Earley we will mention this, not cover it in detail

02/22/20 Speech and Language Processing - Jurafsky and 11 CKY Parsing Consider the rule A BC If there is an A somewhere in the input then there must be a B followed by a C in the input. If the A spans from i to j in the input

then there must be some k st. i

Convert it to binary any arbitrary CFG can be rewritten into ChomskyNormal Form automatically. The resulting grammar accepts (and rejects) the same set of strings as the original grammar. But the resulting derivations (trees) are different. We saw this in the last set of lecture notes 02/22/20 Speech and Language Processing - Jurafsky and

13 Convert Grammar to CNF More specifically, we want our rules to be of the form ABC Or A w That is, rules can expand to either 2 nonterminals or to a single terminal. 02/22/20 Speech and Language Processing - Jurafsky and

14 Binarization Intuition Introduce new intermediate nonterminals into the grammar that distribute rules with length > 2 over several rules. So S A B C turns into S X C and XAB Where X is a symbol that doesnt occur anywhere else in the the grammar.

02/22/20 Speech and Language Processing - Jurafsky and 15 Converting grammar to CNF 1. Copy all conforming rules to the new grammar unchanged 2. Convert terminals within rules to dummy non-terminals 3. Convert unit productions 4. Make all rules with NTs on the right binary

In lecture: what these mean; apply to example on next two slides 02/22/20 16 Sample L1 Grammar 02/22/20 Speech and Language Processing - Jurafsky and

17 CNF Conversion 02/22/20 Speech and Language Processing - Jurafsky and 18 CKY Build a table so that an A spanning from i to j in the input is placed in

cell [i,j] in the table. E.g., a non-terminal spanning an entire string will sit in cell [0, n] Hopefully an S 02/22/20 Speech and Language Processing - Jurafsky and 19 CKY If

there is an A spanning i,j in the input A B C is a rule in the grammar Then There must be a B in [i,k] and a C in [k,j] for some i

CKY The loops to fill the table a column at a time, from left to right, bottom to top. When were filling a cell, the parts needed to fill it are already in the table to the left and below 02/22/20 Speech and Language Processing - Jurafsky and 21

CKY Algorithm 02/22/20 Speech and Language Processing - Jurafsky and 22 Example Go through full example in lecture 02/22/20

Speech and Language Processing - Jurafsky and 23 CKY Parsing Is that really a parser? So, far it is only a recognizer Success? an S in cell [0,N] To turn it into a parser see Lecture 02/22/20

Speech and Language Processing - Jurafsky and 24 CKY Notes Since its bottom up, CKY populates the table with a lot of worthless constituents. To avoid this we can switch to a topdown control strategy Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis.

02/22/20 Speech and Language Processing - Jurafsky and 25 Dynamic Programming Parsing Methods CKY (Cocke-Kasami-Younger) algorithm based on bottom-up parsing and requires first normalizing the grammar. Earley parser is based on top-down

parsing and does not require normalizing grammar but is more complex. More generally, chart parsers retain completed phrases in a chart and can combine top-down and bottom-up search. 26 Conclusions Syntax parse trees specify the syntactic structure of a sentence that helps determine its meaning. John ate the spaghetti with meatballs with

chopsticks. How did John eat the spaghetti? What did John eat? CFGs can be used to define the grammar of a natural language. Dynamic programming algorithms allow computing a single parse tree in cubic time or all parse trees in exponential time. 27

## Recently Viewed Presentations

• Not later than 6 months before the end of the Term, Holden will notify the Dealer of its intention either to offer the Dealer a further term of appointment and the period of, and any conditions (including, but not limited...
• Racial Equity Resources. Racetolead.org - Building Movement Project. Leading With Intent - BoardSource. NonprofitAF - Vu Le, Rainier Valley Corps. National Council of Nonprofits . Race Forward. Inclusive Workplace Toolkit - Gay & Lesbian Fund for Colorado. Achieving Diversity In...
• COMM 2011 Chapter 10 Problem-Solving in Groups © 2011 Cengage Learning © 2011 Cengage Learning © 2011 Cengage Learning © 2011 Cengage Learning © 2011 Cengage ...
• The Late Passenger The sky was low, the sounding rain was falling dense and dark, And Noah's sons were standing at the window of the Ark.The beasts were in, but Japhet said, 'I see one creature more Belated and unmated...
• FLQ Kidnapped James Cross, a British Diplomat , from his Montreal home. If demands were not met, Cross would be executed: \$500 000 randsom. TV/Radio time to broadcast their views. Safe passage out of Canada. FLQ hoped this kidnapping would...
• CANDIDATE TIME PERIODS FOR PROPAGATION STUDIES WITH STEREO-A/B, ACE, WIND, SOHO B. Klecker Max-Planck-Institut für extraterrestrische Physik
• There is no homework assigned. Principle 8: Some learning takes place naturally as we sleep. Students will naturally work on the day's lesson. ~Review and Questions~ Techniques and Materials~ -Sound-color chart: The color chart can draw students' attention. Color chart...
• SAT. Reading Sample Questions. The following passage is adapted from the first inaugural address of President Franklin D. Roosevelt. The speech was given in 1933 as the nation faced the Great Depression, the worst crisis since the Civil War.