Letters on the Philosophy of the Human Mind (Volume 1)

Now that somebody has, we know it scientifically. Would that seriously be a reason to doubt that there are such mental states? Or that they are mental states of different kinds? Or that the brain must be somehow essentially involved in both? But I had supposed that dualistic metaphysics was now out of fashion, in the brain science community most of all.

Brain scientists are supposed to be materialists, and materialists are supposed not to doubt that distinct mental states have ipso facto got different neural counterparts. That being so, why does it matter where in the brain their different counterparts are? It might turn out, for example, that the neural loci of similar kinds of mental process are pretty reliably spatially propinquitous as, indeed, the phrenologists generally supposed.

I once gave a perfectly awful cognitive science lecture at a major centre for brain imaging research. The main project there, as best I could tell, was to provide subjects with some or other experimental tasks to do and take pictures of their brains while they did them. The lecture was followed by the usual mildly boozy dinner, over which professional inhibitions relaxed a bit. I kept asking, as politely as I could manage, how the neuroscientists decided which experimental tasks it would be interesting to make brain maps for. Their idea was apparently that experimental data are, ipso facto, a good thing; and that experimental data about when and where the brain lights up are, ipso facto, a better thing than most.

I guess I must have been unsubtle in pressing my question because, at a pause in the conversation, one of my hosts rounded on me.

Jerry Fodor (LRB, 30 September) asks why learning where the carburettor is should be thought to help us understand how an engine works. By itself it doesn't, but it would be useful to discover that it's connected to the inlet port.

This is perhaps an argument for closer links between functional neuro-imaging and traditional neuro-anatomy.

Jerry Fodor's provocative dismissal of brain imaging (LRB, 30 September) scores some palpable hits, but misses the main target. Yes, the false colour images are marvellously seductive, but by the time you see them they have been so massaged as to risk being thoroughly misleading.

Yes, for most of their brief history the scanners have been either toys for physicists or tools for clinicians. For the former, the technology on its own has been sufficient. For the latter, the images are aids to diagnosis and surgery. For neither has it been necessary to ask the questions about meaning which Fodor properly proposes. However, for a proponent of the case for the modularity of mind to argue that understanding the dynamics of brain processes is an empirical irrelevancy is pretty cheeky, and my guess is that he knows this perfectly well.

Modular minds are at best hypothetical and somewhat improbable, but evidence for the modularity of aspects of brain processes — vision is a good example — is strong. A theory which integrates brain and mind processes will be a major goal for neuroscientists, psychologists and philosophers in the coming decades.

It will need to understand both the particularities of the micromechanisms of nerve cells and their interactions and the dynamics of the system as a whole. The scanners have a vital part to play in providing that understanding. But the first thing we need to do is to dismiss the idea that because a particular brain region is active when, to use Fodor's example, we think about teapots, this means we have located a teapot storage site in the brain. All we have done is found part of a system enabling us to think teapots.

In their less gung-ho moments the scanner enthusiasts know this perfectly well, as in his less cynical mode does the rationalist Fodor. The rest of us trying to weld empirical data with satisfactory brain theory need both.

It is my great pleasure to introduce Jerry Fodor to himself. In a critique of the status and prospects of evolutionary psychology (LRB, 22 January), Fodor wrote:

And about this, exactly nothing is known. Nobody even knows which brain structures our cognitive capacities depend on. A practical solution would be to investigate that relationship. If only we had some way of determining which brain structures support cognition! What if we could measure, in controlled experiments, changes in brain states that result from cognitive changes?

In point of fact, brain imaging not only provides a way to identify brain structures that support cognition when those structures are modular and local, it also offers something much more powerful: a way to study the brain basis of cognition when it depends on a subtle interplay among brain regions, or the detailed action of brain physiology. If Fodor had aimed his critique at the bizarre but all too common view that there is a localised brain module that turns on or off for each distinct class of thoughts in a fantasy taxonomy of cognition, he would have had me and a number of other dissident neuro-imagers rallying to his cry.

But of course that hardly seems likely, given that Fodor has in the past been such a champion of modularity.

Now, suppose that a modeler sets the activation values across the input units (that is, encodes an input vector) of our network so that some units take on an activation level of 1 and others take on a value of 0. In order to determine what the value of a single output unit would be, one would have to perform the procedure just described (that is, calculate the net influence and pass it through an activation function).

To determine what the entire output vector would be, one must repeat the procedure for all output units. There are, however, countless other sorts of information that can be encoded in terms of unit activation levels. Our goal might be to construct a model that correctly classifies animals on the basis of their features.
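The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the logistic (sigmoid) activation function, the three-input, two-output layout and all weight values are hypothetical choices, not taken from the text.

```python
import math

def net_input(activations, weights):
    """Net influence on one receiving unit: the weighted sum of sender activations."""
    return sum(a * w for a, w in zip(activations, weights))

def sigmoid(x):
    """Logistic activation function (an assumed choice; others are possible)."""
    return 1.0 / (1.0 + math.exp(-x))

def output_vector(input_vector, weight_rows, activation=sigmoid):
    """Repeat the single-unit procedure for every output unit.

    weight_rows[u] holds the weights on the connections running into output unit u.
    """
    return [activation(net_input(input_vector, row)) for row in weight_rows]

# Hypothetical toy network: 3 input units, 2 output units.
inputs = [1, 0, 1]
weights = [[0.5, -0.3, 0.5],   # connections into output unit 0
           [-1.0, 0.8, 0.0]]   # connections into output unit 1
outputs = output_vector(inputs, weights)
```

Here the net influence on output unit 0 is 1.0 and on output unit 1 is -1.0, so the first unit ends up more active than the second.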

We might begin by creating a list (a corpus) that contains, for each animal, a specification of the appropriate input and output vectors. The challenge is then to set the weights on the connections so that when one of these input vectors is encoded across the input units, the network will activate the appropriate animal unit at the output layer.

Setting these weights by hand would be quite tedious given how many weighted connections our network has. Researchers would discover, however, that the process of weight assignment can be automated. One such rule, a Hebbian learning rule, states that the weight on a connection from input unit i to output unit u is to be changed by an amount equal to the product of the activation value of i, the activation value of u, and a learning rate. Let us suppose, for the sake of illustration, that our network started out life with connection weights of 0 across the board. We might then take an entry from our corpus of input-output pairs (say, the entry for donkeys) and set the input and output values accordingly.
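As a rough sketch of the weight-change rule just stated, the following Python applies it once to a network whose weights all start at 0. The three-feature, two-animal layout, the "donkey" vectors and the learning rate of 0.5 are hypothetical values chosen for illustration.

```python
def hebb_update(weights, input_vector, output_vector, learning_rate):
    """Return updated weights; weights[u][i] is the connection from input i to output u.

    Each weight changes by (activation of i) * (activation of u) * learning rate.
    """
    return [[w + learning_rate * a_i * a_u
             for w, a_i in zip(row, input_vector)]
            for row, a_u in zip(weights, output_vector)]

# Hypothetical corpus entry: 3 feature units, 2 animal units.
donkey_input = [1, 0, 1]
donkey_output = [1, 0]

weights = [[0.0] * 3 for _ in range(2)]   # all weights start at 0
weights = hebb_update(weights, donkey_input, donkey_output, learning_rate=0.5)
```

Only connections between units that are active together change, so after this one application the weights into the "donkey" unit from the two active features are 0.5 and everything else is still 0.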

Given a corpus of many entries and many applications of the rule per entry, an enormous number of applications of the rule would be required for just one pass through the corpus (called an epoch of training). Here, clearly, the powerful number-crunching capabilities of electronic computers become essential. Let us assume that we have set the learning rate to a relatively high value and that the network has received one epoch of training. What we will find is that if a given input pattern from the training corpus is encoded across the input units, activity will propagate forward through the connections in such a way as to activate the appropriate output unit.

That is, our network will have learned how to appropriately classify input patterns. As a point of comparison, the mainstream approach to artificial intelligence (AI) research is basically an offshoot of traditional forms of computer programming. Computer programs manipulate sentential representations by applying rules which are sensitive to the syntax (roughly, the shape) of those sentences.

Although this is a vast oversimplification, it does highlight a distinctive feature of the classical approach to AI, which is the assumption that cognition is effected through the application of syntax-sensitive rules to syntactically structured representations. What is distinctive about many connectionist systems is that they encode information through activation vectors and weight vectors, and they process that information when activity propagates forward through many weighted connections.

In addition, insofar as connectionist processing is in this way highly distributed (that is, many processors and connections simultaneously shoulder a bit of the processing load), a network will often continue to function even if part of it gets destroyed (for instance, if connections are pruned). The same kind of parallel and distributed processing that enables this kind of graceful degradation also allows connectionist systems to respond sensibly to noisy or otherwise imperfect inputs.

Traditional forms of computer programming, on the other hand, have a much greater tendency to fail or completely crash due to even minor imperfections in either programming code or inputs. The advent of connectionist learning rules was clearly a watershed event in the history of connectionism. It made possible the automation of vast numbers of weight assignments, and this would eventually enable connectionist systems to perform feats that McCulloch and Pitts could scarcely have imagined. Particularly damaging is the fact that the learning of one input-output pair (an association) will in many cases disrupt what a network has already learned about other associations, a process known as catastrophic interference.

Such shortcomings led researchers to investigate new learning rules, one of the most important being the delta rule. To train our network using the delta rule, we start it out with random weights and feed it a particular input vector from the corpus. Activity then propagates forward to the output layer.

Afterwards, for a given unit u at the output layer, the network takes the actual activation of u and its desired activation and modifies weights according to the following rule:

change in weight from i to u = learning rate × (desired activation of u − actual activation of u) × activation of i

That is, to modify a connection from input i to output u, the delta rule computes the product of the difference between the desired activation of u and the actual activation (the error score), the activation of i, and a (typically very small) learning rate.

Thus, assuming that unit u should be fully active but is not and input i happens to be highly active, the delta rule will increase the strength of the connection from i to u. This will make it more likely that the next time i is highly active, u will be too. If, on the other hand, u should have been inactive but was not, the connection from i to u will be pushed in a negative direction.
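A minimal sketch of delta-rule training, assuming linear output units (activation equal to the weighted sum) and a hypothetical two-entry corpus, might look as follows. The function names, the learning rate, the number of epochs and the random seed are illustrative choices rather than anything specified in the text.

```python
import random

def linear_output(weights, input_vector):
    """Activation of each output unit as a plain weighted sum (an assumption)."""
    return [sum(w * a for w, a in zip(row, input_vector)) for row in weights]

def delta_update(weights, input_vector, desired, actual, learning_rate):
    """Apply the delta rule once; weights[u][i] connects input i to output u."""
    return [[w + learning_rate * (d_u - a_u) * a_i
             for w, a_i in zip(row, input_vector)]
            for row, d_u, a_u in zip(weights, desired, actual)]

def train(corpus, n_inputs, n_outputs, learning_rate=0.2, epochs=100, seed=0):
    """Start from random weights and repeatedly run through the whole corpus."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_outputs)]
    for _ in range(epochs):
        for input_vector, desired in corpus:
            actual = linear_output(weights, input_vector)
            weights = delta_update(weights, input_vector, desired, actual,
                                   learning_rate)
    return weights

# Hypothetical corpus: the output unit should be on for the first pattern only.
toy_corpus = [([1.0, 0.0], [1.0]), ([0.0, 1.0], [0.0])]
trained = train(toy_corpus, n_inputs=2, n_outputs=1)
```

When the output unit is less active than desired and an input is active, the error score is positive and the connection is strengthened; when the unit is more active than desired, the same connection is pushed in a negative direction.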

The next item in the corpus is then input to the network and the process repeats, until the entire corpus (or at least that part of it that the researchers want the network to encounter) has been run through. Famed connectionist Frank Rosenblatt called networks of the sort lately discussed perceptrons. He also proved the foregoing truth about them, which became known as the perceptron convergence theorem. Rosenblatt was very much concerned with the abstract information-processing powers of connectionist systems, but others, like Oliver Selfridge, were investigating the ability of connectionist systems to perform specific cognitive tasks, such as recognizing handwritten letters.

Connectionist models began around this time to be implemented with the aid of Von Neumann devices, which, for reasons already mentioned, proved to be a blessing. There was much exuberance associated with connectionism during this period, but it would not last long. Many point to the publication of Perceptrons by prominent classical AI researchers Marvin Minsky and Seymour Papert as the pivotal event. Minsky and Papert showed among other things that perceptrons cannot learn some sets of associations.

The simplest of these is a mapping from truth values of statements p and q to the truth value of p XOR q (where p XOR q is true just in case p is true or q is true but not both). No set of weights will enable a simple two-layer feed-forward perceptron to compute the XOR function. The fault here lies largely with the architecture, for feed-forward networks with one or more layers of hidden units intervening between input and output layers (see Figure 4) can be made to perform the sorts of mappings that troubled Minsky and Papert. However, these critics also speculated that three-layer networks could never be trained to converge upon the correct set of weights.
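The XOR point can be illustrated with simple threshold units. In the sketch below, the hand-chosen weights and thresholds are one hypothetical solution among many, and the grid search over a single unit is only an illustration of the limitation, not a proof of it.

```python
from itertools import product

def step(net, threshold):
    """Threshold unit: fully active if the net input reaches the threshold."""
    return 1 if net >= threshold else 0

def xor_net(p, q):
    """Three-layer network with hand-set weights (a hypothetical choice)."""
    h_or = step(p + q, 0.5)             # hidden unit active if p or q
    h_and = step(p + q, 1.5)            # hidden unit active if p and q
    return step(h_or - h_and, 0.5)      # active if "or" but not "and"

def single_unit_solves(target, grid):
    """Search a grid of weights and thresholds for one unit that computes target."""
    cases = list(product([0, 1], repeat=2))
    for w1, w2, threshold in product(grid, repeat=3):
        if all(step(w1 * p + w2 * q, threshold) == target(p, q)
               for p, q in cases):
            return True
    return False

grid = [x / 2 for x in range(-4, 5)]    # -2.0, -1.5, ..., 2.0
xor_found = single_unit_solves(lambda p, q: p ^ q, grid)
or_found = single_unit_solves(lambda p, q: p | q, grid)
```

With this grid, a single unit that computes OR is found, while none is found for XOR; adding the two hidden units is what makes the mapping possible.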

Connectionists found themselves at a major competitive disadvantage, leaving classicists with the field largely to themselves for over a decade. The generalized delta rule, which is still the backbone of contemporary connectionist research, enables networks with one or more layers of hidden units to learn how to perform sets of input-output mappings of the sort that troubled Minsky and Papert. For the layer of connections running from the final layer of hidden units to the output units, it works roughly the same way as the delta rule.

For a connection running into a hidden unit, the rule calculates how much the hidden unit contributed to the total error signal (the sum of the individual output unit error signals) rather than the error signal of any particular unit. This process can be repeated for networks of varying depth. Put differently, the generalized delta rule enables backpropagation learning, where an error signal propagates backwards through multiple layers in order to guide weight modifications.

Figure 4: Three-layer Network [Created using Simbrain 2.

Connectionism sprang back onto the scene with a monumental two-volume compendium of connectionist modeling techniques (volume 1) and models of psychological processes (volume 2) by David Rumelhart, James McClelland and their colleagues in the Parallel Distributed Processing (PDP) research group.
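To make the backward flow of the error signal in the generalized delta rule concrete, here is a small sketch for a network with one hidden layer. It assumes sigmoid units, a squared-error measure and no bias terms; the layer sizes, seed and learning rate are hypothetical, and this is not meant as a reproduction of any particular published implementation.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Propagate activity from inputs to hidden units to output units."""
    hidden = [sigmoid(sum(w * a for w, a in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * a for w, a in zip(row, hidden))) for row in w_out]
    return hidden, output

def loss(x, target, w_hidden, w_out):
    """Half the summed squared error over the output units."""
    _, output = forward(x, w_hidden, w_out)
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))

def backprop(x, target, w_hidden, w_out):
    """Return the error gradients for both layers of weights."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at each output unit.
    delta_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
    # Each hidden unit's share of the summed output error signals.
    delta_hidden = [
        sum(d * w_out[k][j] for k, d in enumerate(delta_out)) * h * (1 - h)
        for j, h in enumerate(hidden)
    ]
    grad_out = [[d * h for h in hidden] for d in delta_out]
    grad_hidden = [[d * a for a in x] for d in delta_hidden]
    return grad_hidden, grad_out

def sgd_step(x, target, w_hidden, w_out, learning_rate):
    """Move every weight a small step against its error gradient."""
    grad_hidden, grad_out = backprop(x, target, w_hidden, w_out)
    new_hidden = [[w - learning_rate * g for w, g in zip(wr, gr)]
                  for wr, gr in zip(w_hidden, grad_hidden)]
    new_out = [[w - learning_rate * g for w, g in zip(wr, gr)]
               for wr, gr in zip(w_out, grad_out)]
    return new_hidden, new_out

random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]
w_out = [[random.uniform(-1, 1) for _ in range(3)]]
x, target = [1.0, 0.0], [1.0]
before = loss(x, target, w_hidden, w_out)
w_hidden2, w_out2 = sgd_step(x, target, w_hidden, w_out, learning_rate=0.1)
after = loss(x, target, w_hidden2, w_out2)
```

A single small step like this should lower the error on the presented pattern; full training would repeat such steps over every entry in a corpus for many epochs.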

Each chapter of the second volume describes a connectionist model of some particular cognitive process along with a discussion of how the model departs from earlier ways of understanding that process. It included models of schemata (large-scale data structures), speech recognition, memory, language comprehension, spatial reasoning and past-tense learning.

Alongside this compendium, and in its wake, came a deluge of further models. Although this new breed of connectionism was occasionally lauded as marking the next great paradigm shift in cognitive science, mainstream connectionist research has not tended to be directed at overthrowing previous ways of thinking. Rather, connectionists seem more interested in offering a deeper look at facets of cognitive processing that have already been recognized and studied in disciplines like cognitive psychology, cognitive neuropsychology and cognitive neuroscience.

What are highly novel are the claims made by connectionists about the precise form of internal information processing. Before getting to those claims, let us first discuss a few other connectionist architectures. Over the course of his investigation into whether or not a connectionist system can learn to master the complicated grammatical principles of a natural language such as English, Jeffrey Elman helped to pioneer a powerful, new connectionist architecture, sometimes known as an Elman net.

To produce and understand such a sentence requires one to be able to determine subject-verb agreements across the boundaries of multiple clauses by attending to contextual cues presented over time. All of this requires a kind of memory for preceding context that standard feed-forward connectionist systems lack.

In its simplest form, an input is presented to the network and activity propagates forward to the hidden layer.

On the next step (or cycle) of processing, the hidden unit vector propagates forward through weighted connections to generate an output vector while at the same time being copied onto a side layer of context units. When the second input is presented (the second word in a sentence, for example), the new hidden layer activation is the product of both this second input and activity in the context layer; that is, the hidden unit vector now contains information about both the current input and the preceding one.

The hidden unit vector then produces an output vector as well as a new context vector. When the third item is input, a new hidden unit vector is produced that contains information about all of the previous time steps, and so on. Indeed, his networks are able to form highly accurate predictions regarding which words and word forms are permissible in a given context, including those that involve multiple embedded clauses. While Chomsky has continued to self-consciously advocate a shift back towards the nativist psychology of the rationalists, Elman and other connectionists have at least bolstered the plausibility of a more austere empiricist approach.
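The copy-back mechanism described above can be sketched as follows. The sigmoid units, the zero-valued starting context, the layer sizes and every weight value here are assumptions made for illustration; they are not taken from Elman's own simulations.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def elman_step(x, context, w_in, w_context, w_out):
    """One processing cycle: returns (output vector, new context vector)."""
    hidden = [
        sigmoid(sum(w * a for w, a in zip(w_in[j], x)) +
                sum(w * c for w, c in zip(w_context[j], context)))
        for j in range(len(w_in))
    ]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    # The hidden vector is copied onto the context units for the next cycle.
    return output, list(hidden)

def run_sequence(inputs, w_in, w_context, w_out):
    """Present inputs one at a time, carrying the context forward."""
    context = [0.0] * len(w_in)          # assumed starting context
    outputs = []
    for x in inputs:
        output, context = elman_step(x, context, w_in, w_context, w_out)
        outputs.append(output)
    return outputs

# Hypothetical hand-set weights: 2 inputs, 2 hidden units, 1 output unit.
w_in = [[1.0, -1.0], [0.5, 0.5]]
w_context = [[0.8, -0.4], [0.3, 0.9]]
w_out = [[1.0, -1.0]]

seq_a = run_sequence([[1, 0], [0, 1]], w_in, w_context, w_out)
seq_b = run_sequence([[0, 1], [0, 1]], w_in, w_context, w_out)
```

Both sequences end with the same input, yet the final outputs differ because the context units carry a trace of the preceding input. With the context weights set to zero, the network would behave like an ordinary feed-forward system and the two final outputs would coincide.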

Connectionism is, however, much more than a simple empiricist associationism, for it is at least compatible with a more complex picture of internal dynamics. For one thing, to maintain consistency with the findings of mainstream neuropsychology, connectionists ought to (and one suspects that most do) allow that we do not begin life with a uniform, amorphous cognitive mush.

Rather, as mentioned earlier, the cognitive load may be divided among numerous, functionally distinct components. Moreover, even individual feed-forward networks are often tasked with unearthing complicated statistical patterns exhibited in large amounts of data. As an indication of just how complicated a process this can be, the task of analyzing how it is that connectionist systems manage to accomplish the impressive things that they do has turned out to be a major undertaking unto itself (see Section 5).

There are, it is important to realize, connectionist architectures that do not incorporate the kinds of feed-forward connections upon which we have so far concentrated. For instance, McClelland and Rumelhart's interactive activation and competition (IAC) architecture and its many variants utilize excitatory and inhibitory connections that run back and forth between the units in different groups. In IAC models, weights are hard-wired rather than learned and units are typically assigned their own particular, fixed meanings. When a set of units is activated so as to encode some piece of information, activity may shift around a bit, but as units compete with one another to become most active (through inter-unit inhibitory connections), activity will eventually settle into a stable state.

The stable state may be viewed as the network's reaction to the stimulus, which, depending upon the process being modeled, might be viewed as a semantic interpretation, a classification or a mnemonic association. The IAC architecture has proven particularly effective at modeling phenomena associated with long-term memory (content addressability, priming and language comprehension, for instance).
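The settling behaviour can be illustrated with a deliberately simplified competition among units that inhibit one another. This is not the published IAC update equation; the update rule, parameter values and external inputs below are all hypothetical, chosen only to show activity shifting and then coming to rest.

```python
def settle(external_input, inhibition=0.6, decay=0.1, rate=0.2,
           max_steps=500, tolerance=1e-6):
    """Update all units in parallel until activations stop changing.

    Each unit is excited by its own external input and inhibited by the
    summed activation of its competitors; activations are kept in [0, 1].
    """
    activations = [0.0] * len(external_input)
    for _ in range(max_steps):
        total = sum(activations)
        updated = []
        for a, ext in zip(activations, external_input):
            net = ext - inhibition * (total - a)      # input minus rivals' activity
            a_new = a + rate * (net - decay * a)
            updated.append(min(1.0, max(0.0, a_new)))  # clip to [0, 1]
        change = max(abs(n - o) for n, o in zip(updated, activations))
        activations = updated
        if change < tolerance:
            break
    return activations

# Two competing units; the first receives slightly stronger external input.
stable_state = settle([0.5, 0.4])
```

Both units become partly active at first, but the unit with the stronger input gradually suppresses its rival, and the network comes to rest with one unit highly active and the other near zero.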

The connection weights in IAC models can be set in various ways, including on the basis of individual hand selection, simulated evolution or statistical analysis of naturally occurring data (for example, co-occurrence of words in newspapers or encyclopedias) (Kintsch).

An architecture that incorporates similar competitive processing principles, with the added twist that it allows weights to be learned, is the self-organizing feature map (SOFM) (see Kohonen; see also Miikkulainen). SOFMs learn to map complicated input vectors onto the individual units of a two-dimensional array of units.

Unlike feed-forward systems that are supplied with information about the correct output for a given input, SOFMs learn in an unsupervised manner. Training consists simply in presenting the model with numerous input vectors. In principle, nothing more complicated than a Hebbian learning algorithm is required to train most SOFMs. SOFMs were coming into their own even during the connectionism drought, thanks in large part to Finnish researcher Teuvo Kohonen. Ultimately it was found that with proper learning procedures, trained SOFMs exhibit a number of biologically interesting features that will be familiar to anyone who knows a bit about topographic maps (for example, retinotopic, tonotopic and somatotopic maps) in the mammalian cortex.

SOFMs tend not to allow a portion of the map to go unused; they represent similar input vectors with neighboring units, which collectively amount to a topographic map of the space of input vectors; and if a training corpus contains many similar input vectors, the portion of the map devoted to the task of discriminating between them will expand, resulting in a map with a distorted topography.
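One common way to realize this kind of unsupervised map learning is a Kohonen-style update, sketched below for a one-dimensional row of units. This is a swapped-in simplification rather than the specific procedure the text has in mind: the winner-plus-neighbors update, the fixed neighborhood radius and all numeric values are assumptions for illustration.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector lies closest to the input."""
    return min(range(len(weights)),
               key=lambda i: squared_distance(weights[i], x))

def som_update(weights, x, learning_rate, radius):
    """Move the winning unit and its map neighbors toward the input vector."""
    winner = best_matching_unit(weights, x)
    updated = []
    for i, w in enumerate(weights):
        if abs(i - winner) <= radius:    # inside the winner's neighborhood
            updated.append([wj + learning_rate * (xj - wj)
                            for wj, xj in zip(w, x)])
        else:                            # units far away on the map are untouched
            updated.append(list(w))
    return updated, winner

# Hypothetical map: 4 units in a row, each with a 2-dimensional weight vector.
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [0.2, 0.9]]
x = [0.9, 1.0]
new_weights, winner = som_update(weights, x, learning_rate=0.5, radius=1)
```

Because neighbors of the winning unit are dragged toward the same input, repeated presentations tend to leave nearby units responding to similar input vectors, which is the source of the topographic organization described above.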

SOFMs have even been used to model the formation of retinotopically organized columns of contour detectors found in the primary visual cortex (Goodhill). SOFMs thus reside somewhere along the upper end of the biological-plausibility continuum. Here we have encountered just a smattering of connectionist learning algorithms and architectures, which continue to evolve. Indeed, despite what in some quarters has been a protracted and often heated debate between connectionists and classicists (discussed below), many researchers are content to move back and forth between, and also to merge, the two approaches depending upon the task at hand.

Connectionist systems generally learn by detecting complicated statistical patterns present in huge amounts of data. This often requires detection of complicated cues as to the proper response to a given input, the salience of which often varies with context. This can make it difficult to determine precisely how a given connectionist system utilizes its units and connections to accomplish the goals set for it. One common way of making sense of the workings of connectionist systems is to view them at a coarse, rather than fine, grain of analysis: to see them as concerned with the relationships between different activation vectors, not individual units and weighted connections.

Consider, for instance, how a fully trained Elman network learns how to process particular words. One way of determining that this is the case is to begin by conceiving activation vectors as points within a space that has as many dimensions as there are units. For instance, the activation levels of two units might be represented as a single point in a two-dimensional plane where the y axis represents the value of the first unit and the x axis represents the second unit.

This is called the state space for those units. Thus, the activation values of two units can be represented as the point in the plane where those two values intersect. The activation levels of three units can be represented as the point in a cube where the three values intersect, and so on for other numbers of units. Of course, there is a limit to the number of dimensions we can depict or visualize, but there is no limit to the number of dimensions we can represent algebraically.

Thus, even where many units are involved, activation vectors can be represented as points in high-dimensional space and the similarity of two vectors can be determined by measuring the proximity of those points in high-dimensional state space. This, however, tells us nothing about the way context determines the specific way in which networks represent particular words. Other techniques (for example, principal components analysis and multidimensional scaling) have been employed to understand such subtleties as the context-sensitive time-course of processing.
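Measuring proximity in state space amounts to computing a distance between activation vectors. The sketch below uses Euclidean distance, one common choice; the three-unit hidden vectors and the words attached to them are hypothetical values invented for illustration.

```python
import math

def distance(u, v):
    """Euclidean distance between two activation vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest(word, vectors):
    """The other word whose activation vector lies closest in state space."""
    return min((w for w in vectors if w != word),
               key=lambda w: distance(vectors[word], vectors[w]))

# Hypothetical hidden-unit activation vectors for three words.
hidden_vectors = {
    "boy":  [0.9, 0.1, 0.8],
    "girl": [0.8, 0.2, 0.9],
    "eat":  [0.1, 0.9, 0.2],
}

closest_to_boy = nearest("boy", hidden_vectors)
```

The same function works unchanged for vectors with hundreds of units, which is the sense in which similarity can be computed algebraically even where the space cannot be visualized.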

One of the interesting things revealed about connectionist systems through these sorts of techniques has been that networks which share the same connection structure but begin training with different random starting weights will often learn to perform a given task equally well and to do so by partitioning hidden unit space in similar ways. At this point, we are also in a good position to understand some differences in how connectionist networks code information. In the simplest case, a particular unit will represent a particular piece of information — for instance, our hypothetical network about animals uses particular units to represent particular features of animals.

This is called a localist encoding scheme. In other cases an entire collection of activation values is taken to represent something; for instance, an entire input vector of our hypothetical animal classification network might represent the characteristics of a particular animal. This is a distributed coding scheme at the whole animal level, but still a local encoding scheme at the feature level. When we turn to hidden-unit representations, however, things are often quite different. Hidden-unit representations of inputs are often distributed without employing localist encoding at the level of individual units.

That is, particular hidden units often fail to have any particular input feature that they are exclusively sensitive to. Rather, they participate in different ways in the processing of many different kinds of input. This is called coarse coding, and there are ways of coarse coding input and output patterns as well. The fact that connectionist networks excel at forming and processing these highly distributed representations is one of their most distinctive and important features.

Also important is that connectionist models often excel at appropriately processing novel input patterns (ones not encountered during training). Successful performance of a task will often generalize to other related tasks. This is because connectionist models often work by detecting statistical patterns present in a corpus (of input-output pairs, for instance).

They learn to process particular inputs in particular ways, and when they encounter inputs similar to those encountered during training they process them in a similar manner. After training, they could do this very well even for sentence parts they had not encountered before. Consequently, in such cases performance tends not to generalize to novel cases very well. As we have seen, connectionist networks have a number of desirable features from a cognitive modeling standpoint.

There are, however, also serious concerns about connectionism. One is that connectionist models must usually undergo a great deal of training on many different inputs in order to perform a task and exhibit adequate generalization. Nor is there much need to fear that subsequent memories will overwrite earlier ones, a process known in connectionist circles as catastrophic interference.

Unfortunately, many (though not all) connectionist networks (namely, many back-propagation networks) fail to exhibit one-shot learning and are prone to catastrophic interference. Another worry about back-propagation networks is that the generalized delta rule is, biologically speaking, implausible. It certainly does look that way so far, but even if the criticism hits the mark we should bear in mind the difference between computability theory questions and learning theory questions. In the case of connectionism, questions of the former sort concern what sorts of things connectionist systems can and cannot do, and questions of the latter sort address how connectionist systems might come to learn (or evolve) the ability to do these things.

The back-propagation algorithm makes the networks that utilize it implausible from the perspective of learning theory, not computability theory. It should, in other words, be viewed as a major accomplishment when a connectionist network that utilizes only biologically plausible processing principles (activation thresholds and weighted connections) is able to perform a cognitive task that had hitherto seemed mysterious. It constitutes a biologically plausible model of the underlying mechanisms regardless of whether it came to possess that structure through hand-selection of weights, Hebbian learning, back-propagation or simulated evolution.

The classical conception of cognition was deeply entrenched in philosophy (namely, in empirically oriented philosophy of mind) and cognitive science when the connectionist program was resurrected. Nevertheless, many researchers flocked to connectionism, feeling that it held much greater promise and that it might revamp our common-sense conception of ourselves.

During the early days of the ensuing controversy, the differences between connectionist and classical models of cognition seemed to be fairly stark. Connectionist networks learned how to engage in the parallel processing of highly distributed representations and were fault tolerant because of it. Classical systems were vulnerable to catastrophic failure due to their reliance upon the serial application of syntax-sensitive rules to syntactically structured sentence-like representations. Connectionist systems superimposed many kinds of information across their units and weights, whereas classical systems stored separate pieces of information in distinct memory registers and accessed them in serial fashion on the basis of their numerical addresses.

Perhaps most importantly, connectionism promised to bridge low-level neuroscience and high-level psychology. Classicism, by contrast, lent itself to dismissive views about the relevance of neuroscience to psychology. The basic idea here is that if the mind is just a program being run by the brain, the material substrate through which the program is instantiated drops out as irrelevant.

After all, computationally identical computers can be made out of neurons, vacuum tubes, microchips, pistons and gears, and so forth, which means that computer programs can be run on highly heterogeneous machines. Neural nets are but one of these types, and so they are of no essential relevance to psychology. On the connectionist view, by contrast, human cognition can only be understood by paying considerable attention to the kind of physical mechanism that instantiates it.

Although these sorts of differences seemed fairly stark in the early days of the connectionism-classicism debate, proponents of the classical conception have recently made great progress emulating the aforementioned virtues of connectionist processing. For instance, classical systems have been implemented with a high degree of redundancy, through the action of many processors working in parallel, and by incorporating fuzzier rules to allow for input variability.

We should also not lose sight of the fact that classical systems have virtually always been capable of learning. They have, in particular, long excelled at learning new ways to efficiently search branching problem spaces. That said, connectionist systems seem to have a very different natural learning aptitude — namely, they excel at picking up on complicated patterns, sub-patterns, and exceptions, and apparently without the need for syntax-sensitive inference rules.

This claim has, however, not gone uncontested. What Rumelhart and McClelland claimed to have shown was that, over the course of learning how to produce past-tense forms of verbs, their connectionist model naturally exhibited the same distinctive u-shaped learning curve as children. In children, performance on irregular verbs starts out good, then dips as the regular rule is over-applied to them. Lastly, performance increases as the child learns both the rules and their exceptions.

What Rumelhart and McClelland attempted to show was that this sort of process need not be underwritten by mechanisms that work by applying physically and functionally distinct rules to representations. Instead, all of the relevant information can be stored in superimposed fashion within the weights of a connectionist network (really three of them linked end-to-end). Pinker and Prince, however, would charge (inter alia) that the picture of linguistic processing painted by Rumelhart and McClelland was extremely simplistic and that their training corpus was artificially structured (namely, that the proportion of regular to irregular verbs varied unnaturally over the course of training) so as to elicit u-shaped learning.
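The notion of storing several associations "in superimposed fashion" within one set of weights can be illustrated with a toy linear associator. This is only a sketch under simplifying assumptions, not a reconstruction of the past-tense model: the input patterns are hypothetical and deliberately chosen to be mutually orthogonal, so that recall is exact, and the function names (`train`, `recall`) are invented for illustration.

```python
# Toy linear associator: several input -> output pairs stored in ONE weight matrix.
# Assumption: the input patterns are mutually orthogonal, so recall is exact.

def train(pairs, n_out, n_in):
    """Superimpose one outer product per (source, target) pair onto shared weights."""
    weights = [[0.0] * n_in for _ in range(n_out)]
    for source, target in pairs:
        norm = sum(s * s for s in source)  # squared length of the input pattern
        for i, t in enumerate(target):
            for j, s in enumerate(source):
                weights[i][j] += t * s / norm
    return weights

def recall(weights, source):
    """Each output unit sums its weighted inputs."""
    return [sum(w * s for w, s in zip(row, source)) for row in weights]

# Hypothetical 4-unit input codes paired with 2-unit output codes.
PAIRS = [
    ([1, 1, 1, 1], [1, 0]),
    ([1, -1, 1, -1], [0, 1]),
    ([1, 1, -1, -1], [1, 1]),
]

WEIGHTS = train(PAIRS, n_out=2, n_in=4)

if __name__ == "__main__":
    for source, target in PAIRS:
        print(source, "->", recall(WEIGHTS, source), "expected", target)
```

No single weight belongs to any one association; every stored pair contributes to every weight, yet each input still retrieves its own output. With non-orthogonal inputs the stored patterns would interfere, which is one reason real models rely on iterative, error-driven learning instead.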

Plunkett and Marchman went a long way towards remedying the second apparent defect, though Marcus complained that they did not go far enough since the proportion of regular to irregular verbs was still not completely homogeneous throughout training. As with most of the major debates constituting the broader connectionist-classicist controversy, this one has yet to be conclusively resolved.


Nevertheless, it seems clear that this line of connectionist research does at least suggest something of more general importance: namely, that an interplay between a structured environment and general associative learning mechanisms might in principle conspire so as to yield complicated behaviors of the sort that lead some researchers to posit inner classical processes.

Some connectionists also hope to challenge the classical account of concepts, which embody knowledge of categories and kinds.

Membership conditions of this sort would give concepts a sharp, all-or-none character, and they naturally lend themselves to instantiation in terms of formal inference rules and sentential representations. Many concepts, however, seem to lack such conditions. Instead, their referents bear a much looser family resemblance relation to one another. For instance, the ability to fly is more frequently encountered in birds than is the ability to swim, though neither ability is common to all birds.

On the prototype view (and also on the closely related exemplar view), category instances are thought of as clustering together in what might be thought of as a hyper-dimensional semantic space (a space in which there are as many dimensions as there are relevant features). In this space, the prototype is the central region around which instances cluster (exemplar theory essentially does away with this abstract region, allowing only for memory of actual concrete instances).
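The contrast between the two views can be sketched in a few lines. The example below is an illustration under stated assumptions, not a model from the literature: the three features (flies, swims, has feathers), their values, and the function names are all hypothetical. The prototype is taken to be the mean of the stored instances, and the exemplar classifier compares a new item against each stored instance directly.

```python
import math

def distance(a, b):
    """Euclidean distance between two points in feature space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def prototype(instances):
    """Prototype view: the category is summarized by the center of its instances."""
    n = len(instances)
    return [sum(column) / n for column in zip(*instances)]

def classify_by_prototype(item, categories):
    """Pick the category whose central point lies nearest the item."""
    return min(categories, key=lambda name: distance(item, prototype(categories[name])))

def classify_by_exemplar(item, categories):
    """Exemplar view: no abstract center, only the nearest remembered instance."""
    return min(
        categories,
        key=lambda name: min(distance(item, ex) for ex in categories[name]),
    )

# Hypothetical feature order: (flies, swims, has_feathers), each scaled 0..1.
CATEGORIES = {
    "bird": [(1.0, 0.0, 1.0), (1.0, 0.2, 1.0), (0.0, 1.0, 1.0)],
    "fish": [(0.0, 1.0, 0.0), (0.1, 1.0, 0.0)],
}

if __name__ == "__main__":
    robin_like = (0.9, 0.1, 1.0)
    print(classify_by_prototype(robin_like, CATEGORIES))
    print(classify_by_exemplar(robin_like, CATEGORIES))
```

Neither classifier consults necessary and sufficient conditions: a swimming, flightless bird still counts as a bird because it lies closer to the bird region (or to a remembered bird) than to anything else.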

This way of thinking about concepts has, of course, not gone unchallenged (see Rey and Barsalou for two very different responses).

Neuroscientist Patricia Churchland and philosopher Paul Churchland have argued that connectionism has done much to weaken the plausibility of our pre-scientific conception of mental processes (our folk psychology). The classical conception of cognition is, accordingly, viewed as a natural spinoff of this folk theory. The Churchlands maintain that neither the folk theory nor the classical theory bears much resemblance to the way in which representations are actually stored and transformed in the human brain.

What leads many astray, say Churchland and Sejnowski, is the idea that the structure of an effect directly reflects the structure of its cause (as exemplified by the homuncular theory of embryonic development). Thus, many mistakenly think that the structure of the language through which we express our thoughts is a clear indication of the structure of the thoughts themselves.

The Churchlands think that connectionism may afford a glimpse into the future of cognitive neuroscience, a future wherein the classical conception is supplanted by the view that thoughts are just points in hyper-dimensional neural state space and sequences of thoughts are trajectories through this space (see Churchland).

A more moderate position on these issues has been advanced by Daniel Dennett, who largely agrees with the Churchlands regarding the broadly connectionist character of our actual inner workings. He also maintains, however, that folk psychology is for all practical purposes indispensable.

It enables us to adopt a high-level stance towards human behavior wherein we are able to detect patterns that we would miss if we restricted ourselves to a low-level neurological stance. In the same way, he claims, one can gain great predictive leverage over a chess-playing computer by ignoring the low-level details of its inner circuitry and treating it as a thinking opponent. The chess expert wisely forsakes some accuracy in favor of a large increase in efficiency when he treats the machine as a thinking opponent, an intentional agent.

Dennett maintains that we do the same when we adopt an intentional stance towards human behavior. Thus, although neuroscience will not discover any of the inner sentences putatively posited by folk psychology, a high-level theoretical apparatus that includes them is an indispensable predictive instrument.

However, perhaps neither Dennett nor McCauley is being entirely fair to the Churchlands in this regard. What the Churchlands foretell is the elimination of a high-level folk theory in favor of another high-level theory that emanates out of connectionist and neuroscientific research. Connectionists, we have seen, look for ways of understanding how their models accomplish the tasks set for them by abstracting away from neural particulars.

The Churchlands, one might argue, are no exception. Their view that sequences of thoughts are trajectories through a hyper-dimensional landscape abstracts away from most neural specifics, such as action potentials and inhibitory neurotransmitters.

When connectionism reemerged in the 1980s, it helped to foment resistance to both classicism and folk psychology. In response, stalwart classicists Jerry Fodor and Zenon Pylyshyn formulated a trenchant critique of connectionism. One imagines that they hoped to do for the new connectionism what Chomsky did for the associationist psychology of the radical behaviorists and what Minsky and Papert did for the old connectionism.

They did not accomplish that much, but they did succeed in framing the debate over connectionism for years to come. Though their criticisms of connectionism were wide-ranging, they were largely aimed at showing that connectionism could not account for important characteristics of human thinking, such as its generally truth-preserving character, its productivity, and (most important of all) its systematicity. Of course, they had no qualms about the proposal that vaguely connectionist-style processes happen, in the human case, to implement high-level, classical computations.


On their view, human thinking involves the rule-governed formulation and manipulation of sentences in an inner linguistic code (sometimes called mentalese). For instance, from the belief that the ATM will not give you any money and the belief that it gave money to the people before and after you in line, you might reasonably form a new belief that there is something wrong with either your card or your account.

Indeed, given a historical context in which philosophers throughout the ages frequently decried the notion that any mechanism could engage in reasoning, it is no small matter that early work in AI yielded the first fully mechanical models (and perhaps even artificial implementations) of important facets of human reasoning. On the classical conception, this can be done through the purely formal, syntax-sensitive application of rules to sentences insofar as the syntactic properties mirror the semantic ones.
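What "purely formal, syntax-sensitive application of rules" amounts to can be shown with a small sketch. It is illustrative only and assumes a toy encoding invented here: atomic sentences are strings, and a conditional is a tuple of the form `("if", P, Q)`. The rule (modus ponens) inspects nothing but the shape of the sentences, yet it never leads from true premises to a false conclusion, because that shape mirrors what a conditional means.

```python
def modus_ponens(beliefs):
    """Return new sentences licensed by the form ('if', P, Q) together with P."""
    derived = set()
    for sentence in beliefs:
        is_conditional = (
            isinstance(sentence, tuple) and len(sentence) == 3 and sentence[0] == "if"
        )
        if is_conditional:
            _, antecedent, consequent = sentence
            if antecedent in beliefs:  # a purely formal match, no appeal to meaning
                derived.add(consequent)
    return derived - set(beliefs)

def close_under_rule(beliefs):
    """Apply the rule repeatedly until no new sentences are produced."""
    beliefs = set(beliefs)
    while True:
        new = modus_ponens(beliefs)
        if not new:
            return beliefs
        beliefs |= new

# Hypothetical belief set: P, "if P then Q", "if Q then R".
BELIEFS = {"P", ("if", "P", "Q"), ("if", "Q", "R")}

if __name__ == "__main__":
    print(close_under_rule(BELIEFS))
```

The program has no idea what "P" or "R" stands for. Substituting any other sentences for the letters leaves its behavior unchanged, which is the sense in which the processing is sensitive to syntax alone.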

This would, on their view, render connectionism a sub-cognitive endeavor. One way connectionists could respond to this challenge would be to create connectionist systems that support truth-preservation without any reliance upon sentential representations or formal inference rules. Bechtel and Abrahamsen explore another option, however, which is to situate important facets of rationality in human interactions with the external symbols of natural and formal languages.

This proposal is backed by a pair of connectionist models that learn to detect patterns during the construction of formal deductive proofs and to use this information to decide on the validity of arguments and to accurately fill in missing premises. To better understand the nature of their concerns, it might help if we first consider the putative productivity and systematicity of natural languages.

The rules governing English appear to license (1), but not (2), which is made from (modulo capitalization) qualitatively identical parts. We who are fluent in some natural language have knowledge of the rules that govern the permissible ways in which the basic components of that language can be arranged; that is, we have mastery of the syntax of the language.

Sentences are, of course, also typically intended to carry or convey some meaning. Thus (3), which is made from the same constituents as (1), conveys a very different meaning.