Outline DRAFT Updated
I Introduction
II Rationale
o The goal of this project is to create a VR game in which participants catch
a ball while their head, eye, and hand position/orientation are recorded. Using
the participants' data, a reinforcement learning algorithm will be developed to
create a self-learning AI[TM3] . The AI's performance will be compared to the
performance of the participants via simulation in a virtual space.
o The goal of this project is to create a VR game in which participants catch
a ball, and their head, eye, and hand position/orientation are used to develop a
self-learning AI, whose performance will then be compared to a human's in the
task of intercepting a ball.
III Background
A. Artificial Neural Network (ANN)
§ An interconnected group of nodes termed "artificial neurons", in which the
output of one neuron is passed to the input of another.
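A minimal sketch of the ANN definition above, for illustration only. The weights, biases, layer sizes, and the sigmoid activation are hypothetical choices and are not taken from the project's model.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of its inputs, then a sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)


def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Hypothetical numbers: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden = layer([0.5, -1.0], [[1.0, 2.0], [-1.0, 0.5]], [0.0, 0.1])
output = layer(hidden, [[1.5, -2.0]], [0.2])  # hidden outputs feed the next layer
print(hidden, output)
```

The second call to layer() is the "output of one neuron to the input of another" step: the hidden layer's outputs become the output neuron's inputs.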
B. Recurrent neural network (RNN)
§ An RNN is a class of ANN where connections between units/nodes form a
directed cycle, allowing dynamic temporal behavior.
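A minimal sketch of the directed cycle, assuming a single recurrent unit with a tanh activation. The weights w_in and w_rec are hypothetical values chosen only for illustration.

```python
import math


def rnn_step(x, h_prev, w_in, w_rec, bias):
    """New state depends on the current input AND the previous state."""
    return math.tanh(w_in * x + w_rec * h_prev + bias)


def run_rnn(inputs, w_in=0.8, w_rec=0.5, bias=0.0):
    """Feed a sequence through the unit and return the state at every step."""
    h = 0.0  # initial state
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_in, w_rec, bias)  # h feeds back into itself
        states.append(h)
    return states


# One input pulse followed by silence: the state keeps "remembering" the pulse.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The list of states is a small instance of a trajectory through a state space: the later states are non-zero even though the later inputs are zero, because the unit's own output is fed back in.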
C. Dynamic temporal behavior
§ The trajectory of states, in a state space, followed by a system during a
certain time interval[TM4] .
D. Deep learning
§ The application of ANNs that contain more than one hidden layer to learning
tasks.
E. Machine learning
§ An application of artificial intelligence (AI) that provides systems the
ability to automatically learn and improve from experience (by accessing data)
without being explicitly programmed.
F. Reinforcement learning
§ A subdivision of machine learning which allows software agents (models) to
automatically determine the ideal behavior within a specific context, in order
to maximize their performance. This is done by creating a reward function[TM5] ,
which lets the agents learn over many attempts (potentially millions) by
reinforcing the actions that earn more reward; in some tasks, agents trained
this way eventually surpass human ability.
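A minimal sketch of a reward function driving learning, assuming a toy 1-D version of the interception task and tabular Q-learning. Both are stand-ins chosen for brevity and are not the project's actual model; the positions, target, and learning parameters are all hypothetical.

```python
import random

POSITIONS = range(5)   # hypothetical paddle positions 0..4
TARGET = 3             # hypothetical position where the ball arrives
ACTIONS = (-1, 0, 1)   # move left, stay, move right


def step(pos, action):
    """Move the paddle (clamped to the room) and apply the reward function."""
    new_pos = min(max(pos + action, 0), len(POSITIONS) - 1)
    reward = 1.0 if new_pos == TARGET else 0.0  # reward: paddle is on the ball
    return new_pos, reward


def train(episodes=3000, steps=10, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in POSITIONS for a in ACTIONS}
    for _ in range(episodes):
        pos = rng.choice(list(POSITIONS))
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                       # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(pos, a)])   # exploit
            new_pos, reward = step(pos, action)
            best_next = max(q[(new_pos, a)] for a in ACTIONS)
            # Nudge the estimate toward reward + discounted future value.
            q[(pos, action)] += alpha * (reward + gamma * best_next - q[(pos, action)])
            pos = new_pos
    return q


def greedy_policy(q):
    """Best known action for every paddle position."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in POSITIONS}


policy = greedy_policy(train())
print(policy)
```

The agent is never told where the target is. It only sees the reward, and the learned policy ends up moving the paddle toward the target and holding it there.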
G. Complexity of eye movements & uses for navigation and coordination
§ Basic eye movements can be divided into two categories: saccades and
fixations. Saccades are rapid movements of the eye between fixation points, and
a fixation is a point/location the eyes concentrate on for an extended amount
of time.
§ By using these two motions, humans are able to navigate and perform complex
movements autonomously[TM6] .
§ An example of a non-navigation use: when attempting to locate a moving target
which has left the eye's visual field, predictive saccades are made in an
attempt to find the object's location of reappearance[TM7] .
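A minimal sketch of how the two categories above could be told apart in eye tracking data, assuming a simple velocity threshold on a 1-D gaze angle. The 100 deg/s threshold and the sample values are hypothetical; only the 75 Hz rate comes from this outline. Smooth pursuits and blinks are not handled here.

```python
def label_gaze_samples(angles_deg, sample_rate_hz=75.0, saccade_threshold_deg_s=100.0):
    """Label each sample-to-sample interval as 'saccade' or 'fixation'.

    angles_deg: gaze direction in degrees, one value per eye tracker sample.
    Returns one label per pair of consecutive samples.
    """
    dt = 1.0 / sample_rate_hz  # seconds between samples
    labels = []
    for prev, curr in zip(angles_deg, angles_deg[1:]):
        velocity = abs(curr - prev) / dt  # angular speed in deg/s
        labels.append("saccade" if velocity > saccade_threshold_deg_s else "fixation")
    return labels


# Hypothetical trace: hold near 10 deg, jump quickly to 18 deg, hold again.
angles = [10.0, 10.1, 10.0, 14.0, 18.0, 18.1, 18.0]
labels = label_gaze_samples(angles)
print(labels)
```

Fast jumps come out as saccades and the slow jitter around a point comes out as fixation.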
H. Virtual Reality (VR) simulation
§ The VR simulation consists of a 4-sided rectangular room 20 m in length, a
purple cone, a yellow cylinder, 7 orange cylinders, a small green sphere, a
purple vector, and on the far side of the room a red ball.
§ The purple cone represents the participant's head, with the pointed end of
the cone being the front of the head. The yellow cylinder is where the
participant placed the racket to intercept the ball. The 7 orange cylinders are
separate model outputs, which are separated from each other by 5 frames. The
small green sphere is the gaze point, i.e. where the participant was looking.
The purple vector represents exactly where the participant was looking with
respect to depth. The red ball was the target which the participant attempted
to intercept.
§ The ball was thrown at each participant 135 times. On each throw the ball
disappeared at a randomly chosen time of 600, 800, or 1000 ms, with the
"blanking" lasting 500 ms, leaving 300, 400, or 500 ms of post-blanking time
before the ball could be intercepted. The blanking tested whether the
participant could quickly intercept an object whose position had been unknown
for a short period of time. The blanking was randomized to prevent participants
from learning a pattern and to force them to predict where the ball would
reappear.
§ The experiment was performed in a VR environment for multiple reasons, such
as: ease of data collection, retaining the visual structure of the natural
context, controlled manipulation of the ball's trajectory, artificial blanking
of the ball[TM8] , and the avoidance of uncontrollable variables such as wind
resistance, inaccurate throws, etc.
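A minimal sketch of the blanking timing described above. It ASSUMES that the three blank onsets and the three post-blanking durations are fully crossed (9 conditions x 15 repeats = 135 throws); the outline does not state how they were combined, so treat the pairing, the field names, and the seed as hypothetical.

```python
import itertools
import random

BLANK_ONSETS_MS = (600, 800, 1000)   # when the ball disappears
BLANK_DURATION_MS = 500              # how long it stays invisible
POST_BLANK_MS = (300, 400, 500)      # visible time left before interception
N_TRIALS = 135


def build_schedule(seed=0):
    """Return a shuffled list of trials (assumed fully crossed design)."""
    conditions = list(itertools.product(BLANK_ONSETS_MS, POST_BLANK_MS))
    repeats, remainder = divmod(N_TRIALS, len(conditions))
    if remainder:
        raise ValueError("trial count does not divide evenly across conditions")
    trials = [
        {
            "blank_onset_ms": onset,
            "blank_end_ms": onset + BLANK_DURATION_MS,
            "interception_ms": onset + BLANK_DURATION_MS + post,
        }
        for onset, post in conditions
        for _ in range(repeats)
    ]
    random.Random(seed).shuffle(trials)  # random order prevents pattern learning
    return trials


schedule = build_schedule()
print(len(schedule), schedule[0])
```

Under this assumption the time from the start of a throw to interception ranges from 1400 ms (600 + 500 + 300) to 2000 ms (1000 + 500 + 500).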
IV Method
1) Participant performs VR simulation by attempting to intercept the ball with
the paddle. (go into more detail[TM9] )
2) Collect hand, head, and gaze position/orientation from 10 subjects, aged
19-30, using equipment[TM10] with motion capture markers attached. The
equipment included: an Oculus DK2 head-mounted display, a 14-camera PhaseSpace
X2 motion capture system (75 Hz), a built-in SensoMotoric Instruments binocular
eye tracker (75 Hz), and a Wilson badminton racket.
3) Create multiple model (agent) outputs which contain various states
(objective dimensions), state a desired action (interception of the ball using
the paddle), and then create a reward function to optimize the policy.
4) Create media features for simulation presentation.
1. Play, pause, single frame forward, single frame backward.
2. Changing camera viewpoint between fixed, attached to head, attached to ball,
and free camera.
3. Creating method of raising or lowering 1 m in the air.
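A minimal sketch of the play/pause/single-frame controls listed in step 4, written in plain Python. It does not use any Vizard calls; the class and method names are hypothetical, and update() is assumed to be called once per rendered frame by whatever drives the simulation.

```python
class PlaybackController:
    """Tracks which recorded frame of a trial should be displayed."""

    def __init__(self, n_frames):
        if n_frames < 1:
            raise ValueError("a recording needs at least one frame")
        self.last_frame = n_frames - 1
        self.frame = 0
        self.playing = False

    def play(self):
        self.playing = True

    def pause(self):
        self.playing = False

    def step_forward(self):
        """Pause and advance exactly one frame (stops at the last frame)."""
        self.playing = False
        self.frame = min(self.frame + 1, self.last_frame)

    def step_backward(self):
        """Pause and go back exactly one frame (stops at the first frame)."""
        self.playing = False
        self.frame = max(self.frame - 1, 0)

    def update(self):
        """Call once per rendered frame; advances only while playing."""
        if self.playing:
            self.frame = min(self.frame + 1, self.last_frame)
            if self.frame == self.last_frame:
                self.playing = False  # stop automatically at the end
        return self.frame


controller = PlaybackController(n_frames=3)
controller.play()
shown = [controller.update(), controller.update(), controller.update()]
print(shown)
```

Keeping the frame index in one small object means the camera viewpoint code and the drawing code only need to read controller.frame.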
o
VI Discussion
o Discuss results and relate them to scope (papers I've been reading and how
this can help future robotics e.g. navigation, task management, grasp, etc.)
o Make solid conclusions of research (sum up presentation)
o This research could assist in the improvement of robotics AI development
towards visual recognition of a moving object, and the movements needed to
successfully intercept the object.
o The categorization of different eye movements
1. Saccades: rapid movement of the eye between fixation points
2. Fixations: concentrating the eyes directly on a point/location
3. Smooth pursuits: locking onto and following an object's movements fluidly
o How human eyes are used to navigate an environment and perform tasks.
1. The use of anchor point fixations, which allow the head to turn.
2. How the eye navigates towards the center of vision when turning.
3. The eyes' prioritization of movement
4. The eye has a 150-200 ms delay between perceiving a change and reacting to
it, e.g. when reacting to a moving target.
o How to interpret eye tracking data and label it as saccades, fixations,
blinks, and smooth pursuits.
o A basic understanding of coding using Python
o How to use Vizard as a VR simulator
o
[TM1]Due to my short-term memory problems, some of the comments may be
reminders for me to look at later, be directed to the reviewer, or both. If a
comment is unclear, please do not hesitate to contact me and ask questions; I
will attempt to respond ASAP.
Thank you for reviewing!
PS: sorry if I ask something I've already asked before. I may have forgotten,
or I may simply be looking for a formal written explanation.
Presentation room: main auditorium
Timeframe: 10 min
Are there any terms that are not well
defined in the draft?
[TM2]I will detail this more when I have the rest of the structure
complete to ensure a coherent introduction to my presentation
[TM3]Missing anything? Which do you prefer? How could I improve the flow
of either statement?
[TM4]How can I explain these in layman's terms/how are these explained in
relation to parts in the study? E.g. what is our state space/system?
How should I order the background info? I.e. what info should come first in
relation to the others?
[TM5]Define?
[TM6]Place example of turning point action
[TM7]Then use visual comparison of how ball is tracked in RL using SMI (SensoMotoric Instruments) vs
VR
[TM8]Emphasize the inefficiency of the real life ball catching comparison
to enforce the claim.
[TM9]Do I state that the simulation was made with “Vizard VR toolkit”?
[TM10]Does the equipment remain here or should I create an "apparatus"
section? If not;
Do I need to state which physics engine was used?
Do I need to state the degree of view the subjects had while wearing the
Oculus?
Do I need to state the calibration specs?
[TM11]Will we have any solid numbers/graphs or a visual accuracy in the
models before my presentation? Or should I just show the simulation and detail
each part?
[TM12]Use the papers I’ve been analyzing to create a list of scopes this
research could benefit in robotics development
[TM13]Find method of implementing into conclusion to avoid loss of flow in
presentation, only have to explain a little bit then say etc.