Updated Computational Model of NLP Semantics
August 19, 2020 at 12:46 pm #61372

josh: What does the probabilistic multi-dimensional space of meaning look like? Here is a simplified idea that is compatible with the Cooper paper:
Relations/Predicates are like disambiguated language predicates (i.e. "verbs" & "prepositions" of unambiguous type, sense, & argument applicability). The 3 main varieties are binary relations, unary relations, & propositional attitude predicates (they take a fact/sentence argument). Objects can be abstract or concrete. A concrete object, in a situation, is a spatio-temporally connected matter/energy thing that is uniquely picked out by some set of describing predicates/relations in that situation – e.g. a particular day can be a Tuesday – "Tuesday" is not a proper noun in this semantics.

Combinations of objects and relations have truth conditions in a situation, but these can be probabilistic. Situations are defined by sets of these facts that are taken to be completely true in that situation. The situation corresponding to "my experience of true reality" is special. Objects and predicates in other, hypothetical situations are assumed to retain the same characteristics until some reason is given to change that. Which combinations are believed to be true in my reality, & how to guess about unknown/untold combinations, is learned/acquired from experience, while also involving many innate predispositions. If I take 'my friend John' to be a unique object & later learn that there were 2 or more spatially disconnected versions of that, then I need to revise my reality situation to reflect that new information.

Propositional attitudes describe relationships between what holds in different situations – what I say about what Sam believes will happen after the next Presidential election is a statement about a hypothetical situation containing Sam's beliefs, as described by me, in that hypothetical situation where the election has already occurred. Theories/guesses are involved in describing those contents – i.e. heuristic adjustments to what I believe & to what I believe about how Sam's beliefs & reasoning differ from mine.
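To make the moving parts concrete, here is a minimal Python sketch of the structures described above – every name here (Predicate, Obj, Fact, Situation) is an illustrative assumption, not a settled format:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Predicate:
    name: str                  # disambiguated sense, e.g. "give.v.01"
    arity: int                 # 1 = unary relation, 2 = binary relation
    attitudinal: bool = False  # propositional attitude: takes a fact argument

@dataclass(frozen=True)
class Obj:
    describers: frozenset      # predicates that uniquely pick it out in a situation

@dataclass(frozen=True)
class Fact:
    pred: Predicate
    args: tuple                # Obj instances, or a nested Fact for attitudes

@dataclass
class Situation:
    fixed: set = field(default_factory=set)     # facts taken as completely true here
    belief: dict = field(default_factory=dict)  # Fact -> probability estimate

    def prob(self, fact):
        if fact in self.fixed:
            return 1.0
        # unknown/untold combinations fall back to learned guessing heuristics
        return self.belief.get(fact, 0.5)
```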
What is unsatisfying about this picture? There is a large, unpaid research debt involved in filling in the picture of the heuristics that allow novel situations, predicates, objects, & guesses to smoothly but intelligently stick to each other, so that the entire edifice of communication & thought is both informative & computationally viable. Every combination of situation and predicate cannot be thought of as a completely independent mapping from objects to probability estimates. It's necessary that most variations share structural inference forms & vary in smooth, regular ways. Model building should proceed from the principle that this is mostly true & gets modified where it isn't. So the usual compression/structural approximation techniques for multidimensional spaces work most of the time, except at interesting boundaries – e.g. most predictions are off in the situation after the sun explodes. Meanwhile, I can learn to model how my beliefs typically differ from those of the average person, or of some special category like a messianic cultist.
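One hedged way to picture "shared structure except at interesting boundaries" in code – `shared_model` and `exceptions` are assumed components, not proposals:

```python
def estimate(situation, predicate, args, shared_model, exceptions):
    """Probability estimates default to one shared, smooth model; only
    exceptional (situation, predicate, args) combinations are stored apart,
    e.g. most predictions in the situation after the sun explodes."""
    key = (situation, predicate, args)
    if key in exceptions:
        return exceptions[key]            # learned boundary case
    return shared_model(predicate, args)  # smooth, regular variation
```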
Non-concrete/abstract nouns include entities that are collections of concrete nouns, collections with special organizational status (e.g. a corporation or a sports team), and entities like "justice" that are really abstract predicates applying to collections of situations. The truth conditions for statements about a complex noun may be precise or vague – e.g. "SS# XXXX was a Google employee on 1/1/2020" is precise and can make reference to features of the physical environment, vs. "Wednesday is the draggiest day", which is vague & can only make reference to judgments about samples of internal organic experience from some collection of people – & that is arguably also true of the statement "Microsoft is a bully." If we think about a sentence like "John Smith is a bully", we could conceivably claim that the truth of this sentence involves unseen physical features of John Smith's nervous & limbic system – i.e. even if he was mean to someone in the past, it might not be true of John Smith at this moment. However, the other interpretation, which involves sampling reactions to John Smith's behavior over a set of historical & imagined future situations, is perhaps more psychologically compelling.

Reflecting on the above, we propose that claims about a spatiotemporally localized event are different in character from those that are not. Among claims that do not have a spatiotemporally localized character, there are often multiple, competing styles of interpretation – one style may try to substitute a precise physical-world set of criteria, while another style may be interpreted as an abstract sampling over hypothetical exemplar/test cases, & these cases may include subjective attitude judgments from sampled participants.
September 3, 2020 at 7:37 pm #63215

josh: A core goal of tech competency should be "cognitive"/computational capabilities related to understanding the relationship between 2 minimally different situations and sets of NLP sentences that have common presuppositions but describe the difference between the two situations – set A are true in situation 1 but not situation 2, and set B are true in situation 2 but not situation 1. Which additions/subtractions describe the transition from situation 1 to situation 2, and vice versa?
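As a sketch (reusing the illustrative Situation type from the first post), the transition question reduces to set differences over fact sets:

```python
def situation_diff(s1, s2):
    """Return (A, B): A holds in situation 1 but not 2, B in 2 but not 1."""
    a = s1.fixed - s2.fixed   # what the 1 -> 2 transition subtracts
    b = s2.fixed - s1.fixed   # what the 1 -> 2 transition adds
    return a, b               # the 2 -> 1 transition just swaps the roles
```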
Underlying everyday communication is a lot of implicit knowledge about the "normal" flow of events (e.g. the "Yale Shooting Problem"). That knowledge must be learned or innate, & the elements of spoken language can then use common assumptions about the defaults, which don't need to be spoken for understanding. Much of that implicit default can be thought of as normal "tensor flow" in some semantic space.
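A tiny sketch of that implicit default flow, in the spirit of the Yale Shooting Problem – the event names and effect tables are invented for illustration:

```python
def project(facts, event, effects, clips):
    """Default persistence: facts carry over unless the event clips them."""
    return (facts - clips.get(event, set())) | effects.get(event, set())

state   = {"alive(fred)", "loaded(gun)"}
effects = {"shoot": {"dead(fred)"}}
clips   = {"shoot": {"alive(fred)", "loaded(gun)"}}

state = project(state, "wait", effects, clips)   # nothing changes by default
state = project(state, "shoot", effects, clips)  # -> {"dead(fred)"}
```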
September 26, 2020 at 2:51 am #65827

josh: Thinking about how we could use distributed representations/tensor-flow-style learning as building blocks in a model of semantics based around the situation idea. The tensor flow models describe the result of applying "operators". What are the operations, & what are they applied to?
A Situation Frame is a piece/chunk of some possible world, especially "the real world with its real history as *I/me* understand it". Each frame has some fixed, structuring parts – a set of adjacent spatiotemporal locations, or another person's mind/view, or a what-if premise, etc.
Moving between different frames is an example of an operation. Adding facts/parts/elements to a given frame is an operation. Moving our focus within a given frame – taking a closer look – is also an operation. Asking questions about each situation is an operation. We evaluate according to correct answers to questions & according to correct predictions about other testable predicates. I believe that we can productively shrug off many research headaches by being explicit that we are taking a functional/interface approach to these distributed representations of thought. How the mappings are computed/implemented is an implementation detail – let the best implementations win. Our strong & weak semantics is in the system of transitions & its input/output behavior.
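In that functional/interface spirit, a hedged sketch of the operation signatures – nothing here commits to how a frame is represented:

```python
from typing import Any, Protocol

class FrameOps(Protocol):
    def enter_frame(self, frame_id: Any) -> Any:
        """Move between frames: a spatiotemporal chunk, another mind, a what-if."""
    def add_fact(self, frame: Any, fact: Any) -> Any:
        """Add facts/parts/elements; returns the updated frame."""
    def focus(self, frame: Any, region: Any) -> Any:
        """Move focus within a frame -- take a closer look."""
    def ask(self, frame: Any, question: Any) -> Any:
        """Answer a question against the frame; this is what gets evaluated."""
```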
September 26, 2020 at 9:27 am #65830

josh: Consider the experimental setup where the input training theoretically includes/encompasses a set of virtual-world descriptions/events & a set of text corpora. Now pick a formal theory of inference that describes which conclusions are effectively computable from that set of inputs plus a kind of innate endowment/learning bias (not a perfect theory – it's just a parameter of our setup). Claims that are not inferrable & not contradicted are unknown/uncertain. Q&A performance is judged against that automatable standard. Experts can arbitrate select cases to decide whether shortcomings in the learning or in the inference theory have been detected. A gap in the training corpora is not a bug.
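A sketch of that judging standard – `oracle` stands for the assumed automated labeler built from the inputs plus the chosen inference theory:

```python
ENTAILED, CONTRADICTED, UNKNOWN = "yes", "no", "unknown"

def score(answers, oracle):
    """Judge each answer against the computable standard; UNKNOWN is a
    legitimate gold label, since a gap in the training corpora is not a bug."""
    disputed = [q for q in oracle if answers.get(q) != oracle[q]]
    accuracy = 1 - len(disputed) / len(oracle)
    return accuracy, disputed  # disputed cases go to expert arbitration
```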
Theories of engineered knowledge refinement make claims about how certain representations or trained neural nets or whatever have effectively summarized/embedded other corpora. When such a refinement is added in as a parameter, we automate with that, while experts can also be aware of the possibility that the refinement system is the fault location in a failed Q&A test.
Our goal here is to describe the architecture of a way to do progressive refinement that can eventually achieve something like full linguistic/common-sense knowledge without brittle tricks, in the sense that the performance is backed by a rich weak semantics & a computable strong semantics that is brought into the digital/virtual-world realm.
September 26, 2020 at 1:30 pm #65838

josh: A probability distribution over all possible worlds is a giant theoretical thing, not a computable thing. We can compute with constructed situations representing bits of possible worlds, where the situation contains some fixed truths, some probabilities, & some dynamics. Think about going about Q&A by constructing enough pieces of a belief situation to answer the question. Think about operators as what you have learned about the construction steps you can use.
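A hedged sketch of that construction loop – `question.evaluate` and the operator-selection policy are assumed pieces, not designs:

```python
def answer_by_construction(question, seed_situation, operators, max_steps=50):
    """Build just enough of a belief situation to settle the question."""
    sit = seed_situation
    for step in range(max_steps):
        verdict = question.evaluate(sit)  # True/False, or None if undetermined
        if verdict is not None:
            return verdict
        if not operators:
            break
        # Placeholder policy (round-robin); a real system learns this choice.
        sit = operators[step % len(operators)](sit)
    return None  # could not construct enough of the possible world
```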
September 26, 2020 at 10:37 am #65833

josh: We do believe in a strong semantics for all utterances that can reasonably be given truth conditions with strong semantics. Where Frege/Russell/Quine used private-in-the-mind German/English plus inchoate theories of perception to describe their strong semantics, we are using complex graphs in VR, with deep algebras – e.g. paths which arrive at the same location on a map of Smallville also do that in the model, & do that in reality if the map is accurate, and we can compute with that geometry – and theories of machine perception of elements in the VR, and theories of how that relates to human perception. Ultimately we can also develop models of how sensations feel to people & test whether a given model makes the right predictions about experiences & their linguistic reports.
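A tiny computable instance of that path algebra – the grid coordinates standing in for a map of Smallville are, of course, made up:

```python
def walk(start, steps):
    """Compose unit moves; each step is a (dx, dy) displacement."""
    x, y = start
    for dx, dy in steps:
        x, y = x + dx, y + dy
    return (x, y)

home        = (0, 0)
via_main_st = [(1, 0), (1, 0), (0, 1)]  # east, east, north
via_oak_ave = [(0, 1), (1, 0), (1, 0)]  # north, east, east
assert walk(home, via_main_st) == walk(home, via_oak_ave) == (2, 1)
```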
September 26, 2020 at 11:54 am #65834

josh: What are good strategies for constructively generating some next steps in the dimensions of technological achievement that we care about?
To improve mastery of the style of talking (fixing content) – generate sentences that linguistically describe the complete creation or modification of a VR scene/scenario in some formal way. Then ask a human tech to rewrite that in a casual, colloquial way while viewing the scene, trying not to involve extra background knowledge. Then create casual Q&A about that with the same goals. Edit all of it for experimental correctness.
To improve mastery of content – create natural dialog on given topics, then try to whittle the dialog down into sub-sections where the virtual background knowledge that is needed for adequate understanding can be supplied in the experimental context. Catalog discarded parts of the dialog that are hard to incorporate, for future improvement efforts – e.g. parts that require knowing how family relations normally behave or how a salt shaker is normally used.
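For the first strategy, the produced artifact might look something like the record below – the fields are a guess at what the experimental context needs, not a spec:

```python
record = {
    "formal_script": ["add(box1, color=red)", "place(box1, on=table1)"],
    "colloquial":    "Someone put a red box on the table.",
    "qa":            [("What is on the table?", "A red box.")],
    "edited":        True,  # passed the experimental-correctness pass
}
```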
September 26, 2020 at 1:41 pm #65839

josh: My philosophical fathers & grandfathers thought that we should begin our research by agreeing on the ISO format of situational representations. That sounded plausible to me as a student, but now I believe it is premature to try to do that. A false start. An imposition of symbolic form where numerical/distributed representations are sometimes dominant. We need to reject that research premise without rejecting the original need for semantics – getting & expressing true answers about the world through memory recall & reasoning & language. And also doing that for counterfactual alternatives, for planning, for fiction, & for understanding other minds/POVs. Instead of agreeing on formats up front, we provisionally accept blackbox1, blackbox2, etc. as candidate representations for internal semantics, and we are going to be a controlling bully about testing them & building them up through incremental learning.
September 26, 2020 at 2:18 pm #65840

josh: One type of useful consolidation product will be the identification of sets of discourse + VR situations/catalogs of situations that are sufficient for learning various word forms. Sufficient in the sense that adding to the training corpus doesn't add anything more to performance on a broad but targeted testing regime.
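A sketch of that sufficiency test – `evaluate` is an assumed train-and-test harness over the broad but targeted regime:

```python
def is_sufficient(corpus, extra_batches, evaluate, epsilon=0.005):
    """Sufficient iff further training data stops moving test performance."""
    score = evaluate(corpus)
    for batch in extra_batches:
        corpus = corpus + batch
        new_score = evaluate(corpus)
        if new_score - score > epsilon:
            return False  # extra data still helps; not yet sufficient
        score = max(score, new_score)
    return True
```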
September 26, 2020 at 7:08 pm #65876

josh: Q: What should we think about individuals, proper nouns, & the issue of extensionality vs. intensionality in this sort of framework?
A: What we accept as individuals can be a property of situations & types of situations rather than a global universal over the entire language/symbolic system – i.e. what we mean by object permanence dissipates when we consider individual molecules as objects. A situation will often include spatiotemporal connections between other sub-situations that are situations in their own right. Asserting object permanence for particular descriptions across those graph relations is an especially important type of knowledge that may have a substantial innate component for efficiency, though it could also be learned. The same may be said for "unique individual" & "this word names a unique individual".
September 27, 2020 at 3:22 pm #65974

josh: For the situation of doing lots of computations with quantities that are like probabilities, where we are often interested in quantities like conditional probabilities, it probably makes sense to actually work with log probabilities internally.
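A worked instance of the point: in log space, the conditional-probability division becomes a subtraction, long products cannot underflow, and `log_sum_exp` covers the marginalization case:

```python
import math

def log_conditional(log_p_joint, log_p_evidence):
    """log P(A|B) = log P(A, B) - log P(B)."""
    return log_p_joint - log_p_evidence

def log_sum_exp(log_ps):
    """Numerically stable log(sum(exp(x))) for marginalizing in log space."""
    m = max(log_ps)
    return m + math.log(sum(math.exp(x - m) for x in log_ps))

# A chain of 10,000 factors of 0.9 underflows to 0.0 as a float product,
# but stays exact as a sum of logs:
log_p = 10_000 * math.log(0.9)  # about -1053.6
```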
September 27, 2020 at 8:09 pm #65989

josh: It will be worthwhile to try and connect dialog about computer hardware & software with the CASE tool efforts. That domain is in a sector with [mostly digital-world fit], [high commercial value], [a hierarchy of levels of detailed description which are well understood], [lots of online data]. Sometimes formal statements are used with N-tuple predicates, & there is also a good methodological challenge to find the most natural & intuitive ways of describing "formal statement XXXX" in natural language.
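A toy sketch of that verbalization challenge – the predicate names and templates are invented for illustration:

```python
templates = {
    "installed_on": "{0} is installed on {1}",
    "depends_on":   "{0} requires {1} version {2} or later",
}

def verbalize(pred, *args):
    """Render an N-tuple predicate as a natural, intuitive sentence."""
    return templates[pred].format(*args)

print(verbalize("depends_on", "libfoo", "libbar", "2.4"))
# -> libfoo requires libbar version 2.4 or later
```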
September 27, 2020 at 8:21 pm #65990

josh: John Sowa worked on creating symbolic formats for describing categories & corner cases of natural language meaning. In our context, his work is worth looking at to check the completeness of the systems of positive & negative language uses that the system should handle and avoid/reject – e.g. "The fastest event weighed 10 pounds." = *buzz*. That perception of anomaly does seem like it belongs to the realm of psychological models & the way we describe things. For some theoretical physics, it might be okay to circumscribe an event as a large set of molecular movements, associate them with some finite quantity of energy, & calculate the mass of that energy & its weight in some given gravitational field. But clearly, in regular everyday talking, that sentence is an error – *buzz* – dismissed with the statement that "Events are not the kinds of things that have solid bodies & weights."
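A minimal sketch of that kind of positive/negative check – a sort table in Sowa's spirit, with made-up categories:

```python
sorts    = {"weigh": "physical_object", "last_minutes": "event"}
category = {"the trophy": "physical_object", "the fastest event": "event"}

def admissible(pred, arg):
    """Everyday-talk sort check on a predicate's argument."""
    return sorts[pred] == category[arg]

assert admissible("weigh", "the trophy")
assert not admissible("weigh", "the fastest event")
# buzz: "Events are not the kinds of things that have solid bodies & weights."
```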