Motivation and free operant behavior

  • Published on
    12-Jan-2016



Transcript

<ul><li><p>Motivation and free operant behavior: 5</p></li>
<li><p>Actor-Critic</p><p>fMRI (O'Doherty + Dayan), Wightman + Phillips FSCV</p><p>Discrete trial vs. free operant: A/C?</p><p>Free operant: ... interval vs. ratio</p><p>Free operant ...</p></li>
<li><p>Markov Decision Process: states, actions, rewards</p></li>
<li><p>Actor-Critic</p><p>Positive prediction error: things are better than expected</p><p>Update value of state; update policy (prob. of action)</p><p>Negative prediction error: things are worse than expected</p><p>Update value of state; update policy</p></li>
<li><p>Actor-Critic</p><p>Actor: dorsolateral striatum</p><p>Critic: ventral striatum (NAC)</p></li>
<li><p>O'Doherty et al. 2004</p><p>Rewarding; neutral (High 60%, Low 30%)</p><p>1: High reward, neutral</p><p>2: Yoked (RT)</p></li>
<li><p>O'Doherty et al. 2004</p><p>Ventral striatum (NAC) PE</p><p>Dorsal striatum PE</p></li>
<li><p>Roitman et al. 2004</p><p>Fast-scan cyclic voltammetry in striatum</p><p>Cue-elicited lever-pressing for sucrose at peak of DA burst (discrete trial: cue → LP → intraoral sucrose + FB tone)</p><p>Cues elicit DA burst in trained but not untrained rats</p><p>Cue → DA → LP at DA peak</p></li>
<li><p>Corticostriatal synapses: 3-factor learning</p><p>[Figure: stimulus representation (X1, X2, X3, ..., XN) in cortex; adjustable connections (weights); striatum (V1, V2, V3, ..., VN); P; R; PPTN?; VTA/SNc; prediction error (dopamine)]</p></li>
<li><p>... (vigor) free operant</p></li>
<li><p>(Niv, Dayan, Joel)</p></li>
<li><p>How fast? (+ eating time)</p></li>
<li><p>(actions, latencies)</p></li>
<li><p>ARL? Discounted sum of rewards</p></li>
<li><p>RI</p><p>Reinforcements per hour</p><p>Matching: response ratio = reinforcement ratio</p></li>
<li><p>Ratio</p></li>
<li><p>Interval, ratio: interval, ratio. Ratio</p><p>Interval, state</p><p>...</p></li>
<li><p>(directing)</p><p>(energizing), drive</p></li>
<li><p>LP (directing) ...</p><p>Latency, 'Other' (energizing)!</p><p>[Figure labels: RR25; directing effect; energizing effect]</p></li>
<li><p>Q(a,τ,S) = Rewards − Costs + Future Returns − Opportunity Cost</p></li>
<li><p>(lesion)</p></li>
<li><p>= tonic dopamine</p></li>
<li><p>Cousins, Atherton, Turner and Salamone (1996)</p></li>
<li><p>CV, CU</p><p>Dopamine lesion (RT): 42; 21</p><p>Explain the 3-factor learning rule and contrast it with normal Hebbian learning.</p><p>Hungry: utility of reward enhanced threefold (20 → 60)</p></li></ul>
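The Actor-Critic slides describe a prediction error that updates both the value of a state (critic) and the probability of the chosen action (actor). The following is a minimal sketch of that idea, not the presenter's own model. The function names (`softmax`, `actor_critic_step`, `run_two_option_task`), the softmax policy, the learning rates and the discount factor are all assumptions introduced for illustration. Only the 60% / 30% reward probabilities are taken from the O'Doherty et al. 2004 slide.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into choice probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(V, prefs, state, action, reward, next_state,
                      alpha_v=0.1, alpha_p=0.1, gamma=0.9):
    """Apply one actor-critic update and return the prediction error.

    V     : dict mapping state -> estimated value (critic)
    prefs : dict mapping state -> list of action preferences (actor)
    All parameter names and default values are illustrative assumptions.
    """
    next_value = 0.0 if next_state is None else V[next_state]
    # Prediction error: positive if things are better than expected,
    # negative if they are worse than expected.
    delta = reward + gamma * next_value - V[state]
    V[state] += alpha_v * delta               # critic: update value of state
    prefs[state][action] += alpha_p * delta   # actor: update policy
    return delta


def run_two_option_task(trials=5000, seed=0):
    """Hypothetical one-state task with a High (60%) and a Low (30%) option."""
    rng = random.Random(seed)
    reward_prob = [0.6, 0.3]
    V = {"s": 0.0}
    prefs = {"s": [0.0, 0.0]}
    for _ in range(trials):
        probs = softmax(prefs["s"])
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        # next_state=None treats each trial as a one-step episode.
        actor_critic_step(V, prefs, "s", action, reward, None)
    return V, prefs


if __name__ == "__main__":
    values, preferences = run_two_option_task()
    print("state value:", values["s"])
    print("choice probabilities:", softmax(preferences["s"]))
```

In this sketch the same prediction error drives both updates, which is the sense in which one teaching signal could serve the critic and the actor. Over many trials the preference for the 60% option tends to grow relative to the 30% option, though the exact numbers depend on the assumed parameters and the random seed.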
