Motivation and free operant behavior

  • Published on
    12-Jan-2016



Transcript

  • free operant: 5

  • Actor-Critic: fMRI (O'Doherty + Dayan), FSCV (Wightman + Phillips); discrete trial vs. free operant; A/C? Free operant: interval vs. ratio

  • Markov Decision Process: states, actions, rewards

  • Actor-Critic

    Positive prediction error: things are better than expected

    update value of state; update policy (prob. of action)

    Negative prediction error: things are worse than expected

    update value of state; update policy
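The two cases above can be sketched as one tabular update, in which a single prediction error adjusts both the critic's state value and the actor's action preference. This is a minimal illustration, not the model from the slides: the softmax policy, the state names, the learning rates and the discount factor are all assumptions chosen for the example.

```python
import math

def softmax(prefs):
    """Turn action preferences into action probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_step(V, prefs, s, a, r, s_next,
                      alpha=0.1, beta=0.1, gamma=0.9):
    """Apply one tabular actor-critic update and return the prediction error.

    alpha, beta and gamma are hypothetical values, not taken from the slides.
    """
    delta = r + gamma * V[s_next] - V[s]  # prediction error
    V[s] += alpha * delta                 # critic: update value of state
    prefs[s][a] += beta * delta           # actor: update policy for action a
    return delta

# Hypothetical two-state, two-action example.
V = {"cue": 0.0, "end": 0.0}
prefs = {"cue": [0.0, 0.0]}

# Better than expected: positive error, action 0 becomes more likely.
delta_pos = actor_critic_step(V, prefs, "cue", 0, 1.0, "end")
p_action0 = softmax(prefs["cue"])[0]

# Worse than expected: negative error, action 1 becomes less likely.
delta_neg = actor_critic_step(V, prefs, "cue", 1, -1.0, "end")
p_action1 = softmax(prefs["cue"])[1]
```

The same scalar `delta` drives both updates, which is why a single dopamine-like signal could in principle train both the critic and the actor.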

  • Actor-Critic: Actor: dorsolateral striatum; Critic: ventral striatum (NAc)

  • O'Doherty et al. 2004: rewarding; neutral (High 60%, Low 30%)

    1: High reward, neutral

    2: Yoked (RT)

  • O'Doherty et al. 2004: ventral striatum (NAc) PE

    Dorsal striatum PE

  • Roitman et al. 2004: fast-scan cyclic voltammetry in striatum. Cue-elicited lever-pressing for sucrose at peak of DA burst (discrete trial: cue -> LP -> intraoral sucrose + FB tone)

    Cues elicit DA burst in trained but not untrained rats. Cue -> DA -> LP at DA peak

  • Corticostriatal synapses: 3-factor learning

    [Figure labels: X1, X2, X3, ..., XN; V1, V2, V3, ..., VN; P; stimulus representation; adjustable connections (weights); R; PPTN?; cortex; striatum; VTA/SNc; prediction error (dopamine)]
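As a rough illustration of how a three-factor rule differs from a plain Hebbian one, the sketch below gates the usual pre-times-post weight change by a dopamine prediction error. The function names, the learning rate and the purely multiplicative form are assumptions made for the example, not something the slide specifies.

```python
def hebbian_update(w, pre, post, lr=0.1):
    """Two-factor (Hebbian) rule: change depends on pre and post activity only."""
    return w + lr * pre * post

def three_factor_update(w, pre, post, dopamine_pe, lr=0.1):
    """Three-factor rule: the pre * post term is gated by the prediction error."""
    return w + lr * pre * post * dopamine_pe

w0 = 0.5
pre, post = 1.0, 1.0  # both sides of the synapse active (hypothetical values)

w_hebb = hebbian_update(w0, pre, post)            # strengthens regardless of outcome
w_pos = three_factor_update(w0, pre, post, +1.0)  # better than expected: strengthen
w_zero = three_factor_update(w0, pre, post, 0.0)  # as expected: no change
w_neg = three_factor_update(w0, pre, post, -1.0)  # worse than expected: weaken
```

Under this sketch, co-activity alone is not enough to change a weight: the sign and size of the change come from the third factor.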

  • ... (vigor), free operant

  • (Niv, Dayan, Joel)

  • How fast? (+ eating time)

  • (actions, latencies)

  • ARL? Discounted sum of rewards

  • RI: reinforcements per hour. Matching: response ratio = reinforcement ratio
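The matching relation can be made concrete with a small worked example: if two options deliver reinforcers at different rates, matching predicts that responses split in the same ratio. The rates and response total below are hypothetical numbers, and the helper names are introduced only for this illustration.

```python
def matching_response_ratio(reinf_a, reinf_b):
    """Predicted response ratio A:B, equal to the reinforcement ratio A:B."""
    return reinf_a / reinf_b

def matching_allocation(reinf_a, reinf_b, total_responses):
    """Split a response total between two options according to matching."""
    share_a = reinf_a / (reinf_a + reinf_b)
    return total_responses * share_a, total_responses * (1.0 - share_a)

# Hypothetical schedules: 40 vs. 20 reinforcements per hour, 900 responses in total.
ratio = matching_response_ratio(40.0, 20.0)
resp_a, resp_b = matching_allocation(40.0, 20.0, 900)
```

With these numbers the predicted split is about 600 to 300 responses, a 2:1 response ratio equal to the 2:1 reinforcement ratio.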

  • ratio

  • interval, ratio

    interval, state ...

  • (directing)

    (energizing), drive

  • LP (directing) ...

    Latency, 'Other' (energizing)!

    RR25: directing effect, energizing effect

  • Q(a, τ, S) = Rewards - Costs + Future Returns - Opportunity Cost
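To show how an opportunity-cost term can set response speed, the sketch below evaluates a value of this form over candidate latencies. Two assumptions are not stated in this transcript: the cost of acting is taken to grow as `vigor_cost / tau` for short latencies, and the opportunity cost is taken to be an average reward rate multiplied by `tau`. All parameter names and numbers are hypothetical.

```python
import math

def q_value(tau, reward, unit_cost, vigor_cost, avg_reward_rate, future_value=0.0):
    """Rewards - Costs + Future Returns - Opportunity Cost for latency tau.

    Assumed forms: Costs = unit_cost + vigor_cost / tau,
    Opportunity Cost = avg_reward_rate * tau.
    """
    return (reward - unit_cost - vigor_cost / tau
            + future_value - avg_reward_rate * tau)

def best_latency(vigor_cost, avg_reward_rate):
    """Closed-form maximiser of q_value over tau under the assumed cost forms."""
    return math.sqrt(vigor_cost / avg_reward_rate)

def best_latency_grid(reward, unit_cost, vigor_cost, avg_reward_rate):
    """Numerical check: pick the best tau on a 0.01-step grid up to 20."""
    candidates = [0.01 * k for k in range(1, 2001)]
    return max(candidates,
               key=lambda t: q_value(t, reward, unit_cost, vigor_cost, avg_reward_rate))

tau_low_rate = best_latency(4.0, 1.0)   # low average reward rate
tau_high_rate = best_latency(4.0, 4.0)  # higher average reward rate
tau_grid = best_latency_grid(10.0, 1.0, 4.0, 1.0)
```

Under these assumptions a higher average reward rate makes waiting more expensive, so the best latency shrinks, which is one way to read an energizing effect on all actions.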

  • (lesion, ...)

  • ... = tonic dopamine

  • Cousins, Atherton, Turner and Salamone (1996)

  • CV, CU

    Dopamine lesion (RT): 42; 21

    Explain 3-factor learning rule and contrast to normal Hebbian learning

    Hungry: utility of reward enhanced threefold (20 -> 60)
