A blog about teaching Programming to non-CompSci students by Tim Love (Cambridge University Engineering Department). I do not speak on behalf of the university, the department, or even the IT group I belong to.

Thursday, 27 March 2014

Some educational myths

Well, not myths really, but it's sometimes worth challenging cherished notions

Deep down, students know what's good for them

In "Student interest and choice in programming assignments" (Journal of computing in small colleges 26, 6 (June 2011)) Lisa Torrey surveyed 14 students to find out what motivates their choice of programming exercises. She'd hoped that students would choose programs at the optimal level of challenge but found that "students disproportionately chose to write less challenging programs than their interest patterns had suggested". She felt that "many students would choose easy programs that happen to contain other interesting factors".

In March 2014's "Assessment and Evaluation in Higher Education", Felton and Mitchell claim (providing some evidence) that "Faculty who lighten workloads and inflate grades buy high SET [Student Evaluation of Teaching] ratings and popularity for their courses". Older staff are less prone to doing this.

The more a topic's taught, the more students will learn

In "When do students learn? Investigating factors in introductory courses" (Journal of computing in small colleges, 2012) the authors wrote that "we found that instructional time spent on a topic often has a far weaker connection to student learning levels than does instructor emphasis. ... Just spending more classroom teaching time on a concept will not improve student learning as much as an instructor placing greater emphasis on that concept ... For CS1, there were few topics for which there were statistically significant correlations between instructional time and student learning".

The better the teacher's presentation, the more the students learn

In the May 2013 issue of "Psychonomic Bulletin and Review" it was reported that "When a presenter is seen to handle complicated information effortlessly, students sense wrongly that they too have acquired a firm grasp of the material". The students are more confident, but perform no better.

In "The Times Higher" (30/5/13, p.7) it's reported that "lecture fluency did not significantly affect the amount of information learnt".

Clarity is good

In "The secret life of fluency" Daniel Oppenheimer wrote that for some exercises, "participants were significantly more likely to detect the error when the question was written in a difficult-to-read font. This suggests that they were adopting a more systematic processing method and attending more carefully to the details of the question".

Monday, 30 December 2013

Supporting LaTeX

When I began working in a university, part of my job was supporting a group who used LaTeX. They used Unix systems, and I maintained LaTeX on them. Installing could sometimes be a slog - fonts were a hassle, and packages kept appearing and re-appearing with inter-dependencies. I produced some handouts to help people with LaTeX. LaTeX2e appeared, which helped.

Time passed. The web was invented, so I put the handouts online, first as PostScript docs, then PDF, then HTML/MathML. A web search for Tim Love LaTeX reveals that those docs have been widely copied. Documents like The Not So Short Introduction to LaTeX2e by Tobias Oetiker et al, LaTeX for Complete Novices by Nicola L.C. Talbot and Using Imported Graphics in LaTeX and pdfLaTeX by Keith Reckdahl have taken away the need for books, though the TeX Book is still useful.

As Word improved, LaTeX use seemed to recede, recovering as Linux appeared, LaTeX distributions became more stable, and cross-platform front ends like Kile and Texmaker were developed. pdfLaTeX became the predominant latex processor, DVI files becoming a rarity. As web pages grew in sophistication, LaTeX->HTML convertors became less fashionable (I used to generate PDF and HTML files from LaTeX masters, but tend to maintain the files separately now). The LaTeX page in our help system grew. I started giving talks on LaTeX for beginners and for report writers. Some staff made their undergraduate students learn LaTeX.

The CTAN sites became more comprehensive. A searchable catalogue appeared. Usenet newsgroups became Web forums, and sites like latex-community and tex-latex stackexchange attracted beginners and experts.

The LaTeX community has always been mutually-supportive and widely dispersed. Local support is much less necessary than it used to be, but sometimes it helps provide continuity. A Ph.D student who'd been to my talks and had read my handouts produced a class to support local thesis writers and left it with me when he left. It proved popular - our 3rd most popular help-system page. Another student who'd been to a talk improved the class in 2013 - it's available via our help system.

Thursday, 19 December 2013

Space, time, and C++ source code

In "Scientific American", December 2013, it said of reading texts in general that "When we read, we construct a mental representation of the text that is similar to the mental maps we create of terrain and indoor spaces". Students new to programming may have trouble when facing source code if they create inappropriate maps. I think that the more linear (more like prose) the code is, the easier it is for these students to understand. If execution starts at the top of the file, and ends at the end, so much the better.

Some deviations from linearity are fairly easy for beginners to understand because they're like those used in prose. Difficulties arise when the same part of text is executed multiple times, and/or when there isn't a 1-to-1 mapping between the script and behaviour. In one exercise that we give students, we provide the source code of a function to simulate rolling a single die - int RollDie() - and ask them to write a routine to simulate rolling 2 dice for a board game. Rather than write a function that returns RollDie() + RollDie(), some students create 2 copies of RollDie(), calling them RollDie1() and RollDie2(), then write a function that returns RollDie1() + RollDie2(), so that the conceptual 1-to-1 mapping is preserved. In this case, the fact that real world objects are being simulated may complicate the picture, but using rather more abstract maths proofs as a model introduces other misunderstandings.

In this article I'll consider how some features of C++ hinder the type of mental representation that students are used to. Conceptually, the text of a program is more like assembly instructions for flat-pack furniture than a novel. I'll also point out how some analogies used to illustrate how languages work don't help - maths in particular can be a "false friend".

Loops

Small loops aren't too hard to understand - a temporary eddy in essentially linear code.

while loops are more linear than for loops. In a while loop the lines that are repeatedly run are contiguous and in order; control takes only one step back.

In a for loop, the locus of control passes through the terminating condition and the body of the code, then jumps back up to the last clause of the for statement (the increment), then to the terminating condition again before executing the body once more.
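
The difference in reading order can be seen by writing the same loop both ways. This is only a sketch: the function names countWithFor and countWithWhile are made up for the illustration, and the output is collected in a string rather than sent to cout so that the two versions can be compared.

```cpp
#include <string>

// for loop: the three parts of the header are visited at different times.
// Order of execution: initialise, test, body, increment, test, body, ...
std::string countWithFor() {
    std::string out;
    for (int i = 0; i < 3; i++) {
        out += std::to_string(i);      // body
    }
    return out;
}

// The same loop as a while loop: the text is in execution order,
// with a single jump back from the closing brace to the test.
std::string countWithWhile() {
    std::string out;
    int i = 0;                         // initialise
    while (i < 3) {                    // test
        out += std::to_string(i);      // body
        i++;                           // increment
    }
    return out;
}
```

Both functions build the same string; only the layout of the text differs.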

Functions

As far as locus of control is concerned, simple functions aren't too bad. They're rather like footnotes - you jump to them and jump back again, carrying on from where you left off. Conceptually you can in-line them. Recursion is more complicated - essentially, multiple copies of the recursing code have to be imagined if the one-to-one correspondence between text and action is to be retained.
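
The "multiple copies" idea can be spelt out for a small case. The sketch below uses factorial purely as a convenient example (it isn't one of our exercises), and factorial1, factorial2 and factorial3 are invented names for the copies that a reader has to imagine.

```cpp
// Recursive version: one piece of text, several activations of it.
int factorial(int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}

// What has to be imagined to follow factorial(3) while keeping a
// one-to-one correspondence between text and action: one copy per call.
int factorial1() { return 1; }
int factorial2() { return 2 * factorial1(); }
int factorial3() { return 3 * factorial2(); }
```

Calling factorial(3) and factorial3() gives the same answer, but only the second can be read like a chain of footnotes.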

False friends - maths and time

  • After
    int y=2;
    int x=y*3;
    y=4;
    
    what value has x? In C++ the answer is 6, because x=y*3 is evaluated once, when that line is run. People familiar with maths might give the answer 12, because they treat x=y*3 as a symbolic assignment, x being re-evaluated whenever needed.
  • In a maths proof, variables are usually symbolic, and at any time can have any value. In contrast, variables in languages like C++ always have a particular value. In
    int i=0;
    while (i<3) {
      cout << i;
      i++;
    }
    
    the single textual i variable in the expression i<3 has successively the values 0, 1, 2 and 3. The value changes in a way that the value of a maths variable doesn't. The text in a proof is usually processed linearly - a particular i always means the same thing. Exceptions are in "proof by cases" where the reasoning branches (the "4-color problem" was solved using such a proof - a computer program), and "proof by induction".
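
The first point can be checked directly. In this sketch valueOfX wraps the three lines in a function so that the result can be inspected, and tripleOf shows one way to get the "maths" behaviour; both names are invented for the illustration.

```cpp
// x is computed once, from the value that y has at that moment.
int valueOfX() {
    int y = 2;
    int x = y * 3;   // x becomes 6 here
    y = 4;           // changing y afterwards doesn't touch x
    return x + 0 * y;  // still 6 (y is mentioned only to show it's ignored)
}

// To have the relationship re-evaluated whenever it's needed,
// it has to be written as a function rather than an assignment.
int tripleOf(int y) {
    return y * 3;
}
```

valueOfX() returns 6, whereas tripleOf(4) returns the 12 that the maths reading predicts.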

Discontinuities

The distance between a language and its meaning is emphasised when a small change in the language can greatly change behaviour (and vice versa). C++ has several problems of this nature.

  • int x[12];
    
    creates an array of integers whereas
    x[12];
    
    doesn't create an array. It refers to a single element in an array, one which isn't in the array created using int x[12];.
  • The following loop
    int i=0;
    while  (i<3) {
      cout << i;
      i++;
    }
    
    terminates, whereas the similar
    int i=0;
    while  (i<3); {
      cout << i;
      i++;
    }
    
    runs forever.
  • The lines
    char   c='0';
    int    i=0;
    string s="0";
    
    produce variables that all look exactly the same when printed using cout, though they're not the same at all.
  • The lines
    if (x < 4)
    
    and
    if (x << 4)
    
    do different things. The meaning of "<<" depends on context - here it bitshifts but with cout it does something different. It never means "a lot less than".
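
The last two points can be demonstrated with a few helper functions. The names printed, lessThanFour and shiftedByFour are hypothetical, and an ostringstream stands in for cout (it formats values in the same default way) so that the printed text can be compared.

```cpp
#include <sstream>
#include <string>

// Returns the text that would appear if value were sent to cout.
template <typename T>
std::string printed(const T& value) {
    std::ostringstream out;
    out << value;          // here << means "send to the stream"
    return out.str();
}

// One character of difference in the source, two unrelated operations.
bool lessThanFour(int x)  { return x < 4; }    // a comparison
int  shiftedByFour(int x) { return x << 4; }   // a bitshift: x times 16
```

printed('0'), printed(0) and printed(std::string("0")) all give "0", though the three values behave differently as soon as anything is added to them. Note that if (x << 4) is true for an x of 3, because the shifted value (48) is non-zero.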

Conclusions

  • Introduce students to while loops before for loops.
  • The use of flowcharts might help students who are processing the source code as if it were prose. Alternatively, it might help to use a debugger as a code-animator - see below
  • Code-folding editors are useful - they offer a way to make existing code into a "black-box" once it's stable, so that students don't become distracted by verbose detail.
  • Avoid recursion
  • If the meaning of something depends on context, the students need to be able to identify the limits of that context
  • Be prepared to introduce the idea of idioms. If you're learning English, then analysing the phrase "It is raining" down to the word-level is unhelpful - "what does it refer to?" is a linguistics question. Similarly, breaking down something like
       while(fileInput >> str) {
           cout << str ;  
       }
    
    into its constituent parts can easily be overdone by beginners who've been told to analyse, but haven't been told when to stop - "what is inside an ifstream"? They do need to know that it reads successive words from a file into the string called str until there are no more words left.

Monday, 20 August 2012

Making Programming Easier for Beginners

Introduction

In the days when home computers had BASIC it was common for bright children to write little programs - egg-timers, etc. They gained experience of programming as a concept, but just as importantly they got practice at making mistakes and fixing them.

In the Windows/Mac era this option was no longer available. Some of the older lecturers here say they have noticed a consequent reduction in programming skills amongst our new students.

It's been suggested that the Raspberry Pi might be used as a way to encourage entry-level home programming in the future. And perhaps schools might deliver more programmers to universities. But for now we have to cope with a significant minority of students arriving without programming skills. This leads to student anxiety when they're at their most vulnerable, poor end-of-year results, and perhaps an under-use of computing in their career.

In this document I consider the factors involved with these difficulties. The problem goes beyond being purely cognitive. Some students seem to acquire something like a phobia about the subject (not helped by the public perception of Computing being a male, nerdy, obsessive topic).

Attempted solutions aim to

  • Develop programming skills (or aid the transfer of other skills)
  • Reduce the anxiety levels associated with the topic thus aiding the learning process.
  • Identify the students who'd most gain from extra help once term starts.

What is hard about computing

Computing languages are unlike anything else that students are likely to have done. They're not even like maths. Compilers are unforgiving - always criticising and never praising. Students' programming skills span a wider range than that of any other examined skill, so non-programmers will feel stupid right from the start. Comments similar to the following are common - "It's too hard"; "It's too easy"; "Other people find it easy. Why do I find it so hard?"; "If it mattered we'd have been taught it already"; "arbitrary rules, it's all about jumping through hoops".

It may be that we're trying to teach topics that are known to be difficult but that only a few students will ever need. We teach a compiled language (C++) though few students will use one. Speed of execution is rarely a critical factor in student programs, but a significant factor in our choice of language.

People know which topics students find difficult. It's less clear how to exploit this knowledge. In "When do students learn? Investigating factors in introductory courses" ("JCSC", 2012) the authors wrote that "we found that instructional time spent on a topic often has a far weaker connection to student learning levels than does instructor emphasis. ... Just spending more classroom teaching time on a concept will not improve student learning as much as an instructor placing greater emphasis on that concept ... For CS1, there were few topics for which there were statistically significant correlations between instructional time and student learning. ... Interestingly, three topics, control structures, subroutines/functions, and types, had weakly negative correlations."

Easing the Learning Curves

One can broadly categorize the problem areas as follows - Algorithms (What to do), Implementation (How to do it) and Psychology (Finding a congenial context within which to learn). Each category has a learning curve that can be made more shallow. Each has been targeted by educationalists

Algorithms

The idea of planning for eventualities that might occur during the planners' absence is not a scenario that students have much experience in. Also programming involves much more failure (error messages) than success, which distresses beginners. Solutions to these difficulties include getting students to

  • begin solely with algorithms - flowcharts, paper exercises involving recipes, etc
  • begin with easier algorithms - "Add 2 numbers"; "Write a times table". These tasks are boring but at least they let students focus on the implementation.

Implementation

  • Re-order the topics (e.g. delay the introduction of functions, Object-Orientation etc.)
  • Expand the documentation at the start of the course - offer cribsheets, etc.
  • Change the language
  • Write tools targeted to help the student with specific, known problem areas (we've written animations and teaching aids - e.g. 'for' loop help)
  • Use a new language just for the start of the course - Interpreted languages help accelerate the edit-run cycle, reducing the psychological consequences of errors.
    Some sites use Scratch to narrow the gap between algorithms and implementation - the programs are flowcharts. One paper reported that "not only did Scratch excite students at a critical time (i.e., their first foray into computer science), it also familiarized the inexperienced among them with fundamentals of programming without the distraction of syntax".
    Jonathan D. Blake wrote that "Research has shown that assignments that provide early feedback and early success (and that are compelling and visual) are important in improving not just retention, but also gender equity" (JCSC 26, 6, p.126). Languages like Turtle Graphics make feedback more visual (especially if they use robots - students can then see the physical consequences of their work).

Psychology

  • Computer-based Teaching - a supportive development environment can be developed (one that attempts to diagnose errors, for example). But this is likely to be language-specific and expensive to produce
  • More Interesting algorithms - If the task is interesting enough, students will be more stubborn when dealing with implementation issues. Students' supposed interest in computer games has been used to incite interest. Lisa Torrey "concluded that the most important interest factors were graphics, usefulness and entertainment value".
  • Working in Teams - This might help some to learn without exposing their ignorance to staff. Some students like to be taught via the language of their peers. However, the results are patchy and over-praised by students. Also, girls get a lot more help than boys.
  • Offer a choice of slow and fast courses, or easy and hard questions, or extra help. Lisa Torrey found that given a choice, "students disproportionately chose to write less challenging programs than their interest patterns had suggested". She felt that "many students would choose easy programs that happen to contain other interesting factors" ("Student interest and choice in programming assignments", JCSC 26, 6).
    Extra help can be made available in several ways:
    • Making help available by phone or e-mail, or having a drop-in surgery
    • Offering 1-to-1 tuition
    • Online self-help (or supervised) groups (could offer marks for the quality of involvement; could use Skype).

Tuesday, 14 August 2012

Using Scratch to prepare students for programming

Scratch, first launched in 2005, is considered one of the best languages for introducing children to programming. Some universities are beginning to introduce Scratch into introductory courses.

Scratch

Scratch programs are flowcharts that the programmer constructs by dragging blocks around the screen. The blocks are shaped so that they click together if the "syntax" is correct. It lacks methods (functions), so it doesn't use parameters or return values, but it does have Events and Threads, both of which are important in modern computing.

Extensions to Scratch exist that add functions, etc - see for example Build Your Own Blocks. Scratch 2.0 is beginning to be used - it has procedures, webcam support, Lego support and even support for cloud programming.

Harvard

Harvard tried Scratch in "Harvard Summer School's Computer Science S-1: Great Ideas in Computer Science", the summer-time version of an introductory course at Harvard College. Details are at http://cs.harvard.edu/malan/publications/fp079-malan.pdf. They used Scratch to introduce the idea of programming and set some exercises before moving on to Java - "Though some students spent only 2 or 3 hours on their first problem set, others spent upwards of 20, implementing projects more advanced than any of those written in lecture." One student wrote "Though I did not yet know how to create a for loop, I knew when a for loop was necessary because I had used loops in my Scratch program."

Encouraged by this, the staff introduced Scratch on the full undergraduate course. CS50 is Harvard College's introductory course for majors and non-majors. Typical enrollment is 300 students. Scratch is used in the first two lectures and the first assignment; the rest of the course is taught in C, PHP, and JavaScript. Details are on http://infoscratch.media.mit.edu/SIGCSE2010Workshop

Documentation and initial Scratch code are available. See SCRATCH for Budding Computer Scientists (a tutorial).

Berkeley

In 2009 Berkeley trialed a course based on Scratch which they later introduced in an alternative introductory computing course. http://www.eecs.berkeley.edu/news/cso.pdf describes the reasoning behind the course changes, noting that

  • Scratch supports some advanced (Web 2.0) ideas. It allows students to upload their finished graphical programs to the web which can then be run online in a web browser, downloaded, modified (or, "re-mixed") and re-uploaded.
  • Scratch encourages broader participation - the report gives statistics on female and Hispanic participation. They write that "We have a longstanding goal to provide alternative paths to prepare students for CS61a. The traditional path to CS61a, of taking the AP computer science test, suffers from little participation by populations that are typically under-represented in computer science."

Wisconsin

Their notes are at http://pages.cs.wisc.edu/~dusseau/Classes/CS202-F11/. The weekly exercises for this course are online - e.g.

  • week 1 - exploring the Scratch website, playing a game and answering a survey
  • week 2 - "this homework has two parts. In Part A, you'll use Scratch to draw an interesting picture. In Part B, you'll analyze different scripts written in Scratch and decide if they have the same functionality or not."

This course develops the social networking aspect. One of the first tasks the students are asked to do is upload a photo of themselves.

Rutgers

Their Programming for the Masses course uses Scratch. Though it's not a programming course, students do learn to write short programs. One development of this is the Scratchable Devices project where students can program their household devices. Using devices equipped with an XBee module connected to Arduino microcontrollers, they can switch lights off and on by clapping, etc.

Ohio State University

In an Outreach course they offer some partly working Scratch files ("Save the Turtle", etc) and invite the student to modify them. Their document is worth reading for the exercise - http://www.cse.ohio-state.edu/~paolo/outreach/ScratchSE/LabOverview.doc

Kent State University

Their introduction to computer science uses Scratch then JavaScript. Few details are online - http://www.cs.kent.edu/~volkert/10051/

Conclusions

Pros

  • Reputable universities have already done a lot of the work that we'd need to do. Proposals for course-changes, Scratch tutorials and exercises for students are all online.
  • Some students have an impoverished mental representation of programs - they don't "chunk". Scratch programs match my internal representation of simple programs - objects are nearly independent and have dynamic internal structure.
  • Though not many universities are officially using Scratch, the more advanced school and outreach exercises offer sufficient challenges for students-to-be. See for example NeboMusic Polygon Robot exercise.

Cons

  • Scratch leads more naturally to Object-orientation, a trend that some universities have been distancing themselves from. Moreover, it segues poorly into initial C++ exercises like "get the user to type 2 numbers in. Display the sum" or "print the 5 times table"
  • Scratch bypasses most of the language features that C++ students find most difficult.
  • Quite a lot of the excitement of using Scratch is the social-network aspect, but this needs some management

Wednesday, 13 July 2011

C++ function problems

This page shows some of the students' imaginative attempts at solving one part of the 1st year Michaelmas computing course. The task is one that's been on the course for years, but this year we spoonfed them less. The students would have the same difficulties in many other languages, though C++ provides rather more challenges than Matlab (for example) would.

How we introduce functions

We explain functions gradually, using a mixture of explanation and practical work. The explanations include animations, diagrams showing functions as boxes with inputs and outputs, and sections entitled

We tell them that they have to do 3 things when writing functions: write a prototype, write the function, and call the function. The practical work begins with a gentle learning curve as follows

  • We show them is_even (a function that determines whether an integer is even) and give them a program that uses the function to see which integers in the range 1 to 10 are even. Then we ask them to produce a similar program to identify multiples of 3. This requires them to modify the code trivially, but we insist that they change the name of the routine (so they have to change the call and prototype too).
  • We get them to write a times-table program using a timesby7 function that we provide (the program will be similar to the above program)
  • We get them to write a program with just a main function, then get them to restructure it without changing its output so that it has main and another function.
  • Then we get them to write bigger programs with functions written from scratch
  • Then we get them to use library functions.
Common problems include
  • Thinking that the prototype
         int fun(int number);
    
    means that they have to call fun with a parameter called "number". We could get them to write the prototype as
         int fun(int);
    
    but that's not considered good style
  • Thinking that the prototype
         
         int fun(int number);
    
    calls the routine. We could ask them to prove to us that it does - by adding a cout call to the function.
  • Thinking that if a function prints out the answer, that's the same as returning it.
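
The three steps, and the first and last of these misunderstandings, can be sketched in a few lines. The body of timesby7 is an assumption (the text doesn't show our version), and answerForSix and printTimesBy7 are names invented for the illustration.

```cpp
#include <iostream>

// Step 1 - the prototype. It only announces the function; nothing runs.
// The parameter name here is documentation, not a requirement on callers.
int timesby7(int number);

// Step 3 - a call. The argument is a variable called "six", not "number".
int answerForSix() {
    int six = 6;
    return timesby7(six);
}

// Step 2 - the function itself (body assumed for this sketch).
int timesby7(int number) {
    return number * 7;
}

// Printing is not returning: this shows the answer on the screen,
// but the caller gets nothing back to store or add up.
void printTimesBy7(int number) {
    std::cout << number * 7 << std::endl;
}
```

A line such as int total = timesby7(2) + timesby7(3); compiles, whereas the same line using printTimesBy7 does not, because a void function has no value to add.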

Playing with dice

About half way through their Michaelmas term work, when they've already used functions, we give the 1st years the following code

int RollDie()
{
   int randomNumber, die;

   randomNumber = random();
   die = 1 + (randomNumber % 6);
   return die;
}
We say

Each time the function random() is called, it will return a random positive integer. Work out what the ... function does and how it works.


One student didn't know how to search for "%" on a web-page, hence couldn't find where we'd described what the "%" operator did. I worry about students' webskills sometimes.

Then later in the handout we say

You've already seen the RollDie function that simulates the rolling of a single die. Copy it into your new file. Now write a function called Roll2Dice to simulate the rolling of 2 dice (call RollDie twice and return the sum of the answers). Before going any further, test it. If it doesn't work, neither will your full program! Here's a main function you could use

  int main() {
     srandom(time(0));
     cout << "Roll2Dice returns "  << Roll2Dice()  << endl;
  }
You'll need to add prototypes for RollDie and Roll2Dice too.


This task contrasts with last year's work where after we gave them the code for RollDie() we gave them the code for a routine with the prototype int RollManyDice(int M) (though we didn't provide the prototype or the final return ... line of the function). We made the change because we'd rather students programmed something simple themselves than merely type in more complex code that they don't understand.

Here's a list of solutions that students have tried

  1. Several start by writing this prototype
       bool Roll2Dice()
    
    because the first function introduced to them returns a bool.
  2. A few start by writing this prototype
       int Roll2Dice(int RollDie(),int RollDie() )
    
    because RollDie is "needed" by Roll2Dice, I presume.
  3. Some do this
    int Roll2Dice()
    {
       int randomNumber, die;
    
       randomNumber = random();
       die = 1 + (randomNumber % 12);
       return die;
    }
    
    (returning an integer from 1 to 12) or this
    int Roll2Dice()
    {
       int randomNumber, die;
    
       randomNumber = random();
       die = 2 + (randomNumber % 11);
       return die;
    }
    
    (returning an integer in the range 2 to 12, all the outcomes equally likely) or this
    int Roll2Dice()
    {
       int randomNumber, die;
    
       randomNumber = random();
       die = 1 + (randomNumber % 6);
       return 2*die;
    }
    
    (i.e. rolling a die and doubling the outcome). I think these examples illustrate that common sense suffers when students are struggling with C++.
  4. Quite a few people start by writing a new function to simulate the 2nd die.
    int RollDie2()
    {
       int randomNumber2, die2;
    
       randomNumber2 = random();
       die2 = 1 + (randomNumber2 % 6);
       return die2;
    }
    

    Some then try doing

      int total=die+die2;
    
    later in their program rather than calling the functions, not realising that die (in RollDie) and die2 (in RollDie2) are unavailable. At this point some students create global variables die and die2 while still creating the local instances of die and die2 - which silences the compiler but isn't the correct solution.

    Others write a Roll2Dice() function that calls RollDie() and RollDie2() to get the correct answer. Perhaps the existence of 2 dice makes them think they need 2 functions - I suspect they wouldn't write 2 functions to calculate the square roots of 2 numbers, or write 10 functions to roll 10 dice.

  5. The next is one of the most common solutions, not calling the provided RollDie function at all.
    int Roll2Dice()
    {
       int randomNumber, die, randomNumber2, die2;
    
       randomNumber = random();
       die = 1 + (randomNumber % 6);
       randomNumber2 = random();
       die2 = 1 + (randomNumber2 % 6);
    
       return die+die2;
    }
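
Why the "2 + (randomNumber % 11)" attempt in solution 3 isn't equivalent to rolling two dice can be shown by counting rather than by running random trials. The sketch below is an illustration only, with invented function names; it counts how many ways each total can arise.

```cpp
#include <array>

// Number of ways each total can arise from two fair dice.
// The index is the total, so indices 0 and 1 stay at zero.
std::array<int, 13> waysWithTwoDice() {
    std::array<int, 13> ways{};               // all counts start at zero
    for (int die1 = 1; die1 <= 6; die1++) {
        for (int die2 = 1; die2 <= 6; die2++) {
            ways[die1 + die2]++;
        }
    }
    return ways;
}

// The same count for 2 + (r % 11), letting r run once through
// each of the 11 possible remainders.
std::array<int, 13> waysWithOneFormula() {
    std::array<int, 13> ways{};
    for (int r = 0; r < 11; r++) {
        ways[2 + (r % 11)]++;
    }
    return ways;
}
```

With two dice there are 6 ways to make 7 but only 1 way to make 2 or 12; with the single formula every total from 2 to 12 arises exactly once, so a board game using it would behave quite differently.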
    

Conclusions

I was hoping for

int Roll2Dice() {
   int die1=RollDie();
   int die2=RollDie();
   int sum=die1+die2;
   return sum;
}

or even just

int Roll2Dice() {
   return RollDie() + RollDie();
}

It's easy enough in a handout to explain how to write correct code, but this year we didn't want to tell them exactly what to type. Just about everything that we didn't dictate to them produced errors that revealed a lack of understanding. I think it would be counter-productive to anticipate and correct these misunderstandings by putting a list of what not to do in the handout - it would confuse them. Besides, it's useful to have these conceptual errors exposed as early as possible as long as demonstrator help is available.

Some of the solutions above are correct and the students often understand what they've written, so there's a case for letting them get on with it, but they're going to have bigger problems later if these conceptual hurdles aren't tackled now. (I once looked at a IIB project student's final program. It barely used functions. By factorising repeated code I reduced the line-count to 30% of the original. Worrying).

Some students are clearly just guessing as they go along, looking for any lines of code that look as if they should be copied. It would help if they revised earlier work, or traced their finger along the locus of control, explaining it line-by-line. Others start with a reasonable idea of what to do but make small mistakes that lead to bigger ones as they try to silence the compiler at all costs. It would help if they could identify run-time errors as soon as possible, but iterative development is something they only slowly learn, and besides, not all of them know what results to expect.

Understanding functions remains a problem. We introduce functions by analogy with mathematical functions, but in C++ they can see inside the black-box that is the function, and once they do, they find it hard to treat the function like a black-box ever again (it becomes a physical thing occupying space, rather than a concept). As an educational aid it helps to have an editor that collapses functions.

Frequencies

We get the students to run routines like Roll2Dice() and record the outcomes. They find

   int outcome=Roll2Dice();
   frequency[outcome]=frequency[outcome]+1;

hard to understand, which isn't so surprising given that after 6 hours of practicals

  • a few students still don't know how to add 1 to a simple variable.
  • more than a few students have "no idea" how to write a line that "creates an array called frequency big enough to store 6 integers". I left one such student to read the documentation for a few minutes, but when I returned to him he was none the wiser. The Arrays section of the doc might be sub-optimal, but it can't be that bad - it's much the same as last year's.
Even those who do understand arrays have trouble with the code quoted above, though they're happy with
   int outcome=Roll2Dice();
   if(outcome==1)
      frequency[1]=frequency[1]+1;
   if(outcome==2)
      frequency[2]=frequency[2]+1;
   ...
I've tried to spell out a 2-page explanation of the shorter version as a Frequently Asked Question but some people don't understand that either. When the penny drops they sometimes remark "that's clever". Then they try to be too clever and do
    frequency[Roll2Dice()]=frequency[Roll2Dice()]+1;

wondering why it fails (they're calling Roll2Dice twice, so the RHS and LHS may refer to different array elements). Having fixed it they put the line in a loop. A few students do this

int tries=0;
while(tries<100) {
   int outcome=Roll2Dice();
   frequency[outcome]=frequency[outcome]+1;
   frequency[outcome]=0;
   tries=tries+1;
}

Why is frequency[outcome]=0; there? Well, one student said that it was in an earlier loop so they thought they'd better put it in this loop too.

In short, there are still many indications that a non-trivial minority of students are just fumbling blindly through. If anything, the changes to the course this year make it easier for demonstrators to identify the students with severe problems - it's harder for students to bluff their way through.

According to "Validating an instructor rating scale for the difficulty of CS1 test items in C++" (Lulis and Freedman, JCSC 27, 2 (December 2011)) "faculty members disagree amongst themselves as to the difficulty level of questions involving functions", much more so than for questions involving other topics.

Thursday, 2 June 2011

Online help systems

On 29 June 2011 I'm attending a UCISA symposium on "Advisory and IT Support" about "Producing a service desk good practice guide - Measuring the service desk". A parallel session that I'd also have liked to attend was from University of Lincoln - "A project, led by ICT, focused on changing the ICT and Estates service by embedding a new culture across both departments that delivers excellent, consistent service, underpinned by a robust framework of technology, processes, learning, development and support". As preparation I thought I'd summarise our experiences.

Changes in user skills and expectations regarding the WWW offer challenges and opportunities for information providers, but it's far from being merely a technological issue that can be solved by a new piece of software. As Lincoln has discovered, culture (amongst staff as well as amongst users) and processes matter, and need to be understood. MIT have produced a report on their attempt to update their system - see the Nercomp Hermes presentation.pdf. They say that a culture of "Knowledge Centered Support" involves developing "knowledge as part of problem-solving process - When you solve a user’s problem, document it". Inhibitions to this included

  • Wiki markup (or browser support for WYSIWYG editors)
  • Time pressure (for frontline support, primarily)
  • Different self-imposed “standards” for publication
  • Unease with public information, even if there’s nothing inherently confidential
  • "Ownership": if I write about it, I might have to support it

Their list of "Lessons Learned" included

  • "It's harder to get contributors than we thought"
  • "Did not set up a tracking mechanism up front. Can't tell who's looking at what."
  • "Get buy-in from decision makers to make "executive" decisions setting expectations for internal IS&T groups to contribute information"
  • "Need to be clear about what information goes where, e.g. website versus knowledge base"
  • "Ongoing maintenance is required to keep content fresh."
  • "There's a need for an advocacy role"

Their project had a definite end date, finite resources and a special one-time allocation of funds, which helped to force things forward.

Many of these findings chime with my observations - see the Searching, Culture, Distributed Authorship, and Solutions sections below. My department has had an online help system since 1994, several authors having produced pages. Our oldest page was last modified at "1995-01-06 12:41:20 GMT". A 1997 version of our front page is still online

Oct 1997 (traffic 33k pages/day)
Aug 2010 (traffic 201k pages/day)

Apart from the house styling, you'll see that little has changed on the surface - even in 2010 it was still described as the "hypermedia help system"! In 1996 I wrote a little review of the help system where I mentioned that

  • "We are encouraging (not very successfully) admin and teaching staff to maintain their own material."
  • "We have regular users of our system who still don't use e-mail let alone the help system, so personalised user education is still necessary."
  • "[people are] Over-using brute-force searches"

We had a help-search facility but I don't know what it was - Google hadn't really taken off by then. Maybe we already used swish, an earlier version of swish-e, a public-domain indexing facility.

Keyword searching wasn't the only option for users - we offered "task-based" and "subject-based" trees of links, and pages had at their foot a list of related pages so that users could browse around. In those days there were several sites (e.g. Yahoo) that maintained a hierarchy of links to pages so that people could browse as an alternative to word-searching when they wanted to find something out. Even in 1996 however, people preferred brute-force searches though their search terms were often more hopeful than precise.

Between 1996 and 2003 the amount of material grew, as did the variety of types (PHP, movies and databases appeared). Our dependence on the help system as a front-line service grew too - in our introductory letter to new undergraduates we wrote "The department has an extensive help system ... which has answers to many questions people ask about the Engineering Department computer system. Please look at this first and if you cannot find an answer there consult the Department's Computer Operators". The success of Google meant that, more than ever, users word-searched for information rather than browsing through hierarchies. Google ranked pages in a way that satisfied most users. Customised site-specific searches could be set up using Google, but there were difficulties using it to look for local information because some of that information was domain and/or password protected.

The growth in local material hadn't matched growth globally. Many of the documents we wrote in the early years had been superseded by documents elsewhere. Because of the increased performance of the internet there was not even a speed advantage to having local documentation.

Searching

By 2003 it was clear that university establishments might have special requirements when it comes to searching. The "Search facilities for UK HE web sites" paper (written by a Cambridge webmaster) dates back to 2003 and lists some useful alternatives.

People searching our site run into several problems:

  • It can be hard to think up useful search terms for questions involving a lack of understanding rather than a simple lack of information
  • Many queries involve generic computing terms (e.g. "open", "windows", "word") which makes ranking more difficult.
  • Coverage will be patchy with some common questions not covered while other obscure topics might be covered in a depth that swamps the results of searches. If a user fails to find information in their first help-search (which is likely) they'll be wary of using it again. At least Google comes up with something even if it's not locally relevant.

A 2007 consultant's report about the University site noted that local searches still posed problems - "Most of the [users'] complaints about the site fell into three areas:" the first-mentioned being "inadequate search facility: It was generally felt that an external google search yielded more appropriate and better presented results than the search function on the existing University website". The report went on to say that "As a minimum Google search should be implemented across site content .... Adoption of a University-wide meta-tagging is a prerequisite and a major editorial undertaking that should be done as part of the initial content rework". However, this recommendation seems not to have been adopted by the University.

Culture

The department's help system has a rather elite target audience who are science-literate, but can be naive computer-wise. Amongst the opinions about the Help System are these -

  • That it in some sense belongs to the Computer staff (it does)
  • That it in some sense belongs to a small subset of Computer Officers (it doesn't. Any CO or operator can contribute)
  • That pages have to be written in HTML using the current house style (they should. Example pages are provided)
  • That authors will get mailed if a page is wrong (they will if links go bad, or if mistakes are noticed in a popular page)
  • That it's old-fashioned static HTML (most of it is static HTML)
  • That material is hard to find

Some of these beliefs inhibit page production

  • If a person thinks that they're not allowed to write pages, they might add them to their research group's web site or to their personal pages
  • An author might rather not write a page at all than have to maintain pages long-term.
  • If the help system aims to replace work done by people, those people will be out of a job (or at least will have to do less pleasant work)

Some of these beliefs inhibit users trying the help system

  • For the 50% of students who use Facebook at least once a day, the help system will look old
  • The un-Googly search looks unfriendly

Though the skills that web users employ to further their hobbies aren't always used in their academic work, the gap between the help system and other information systems has widened recently.

Distributed Authorship

From the start, pages in the help system were owned and looked after by many people, though a small number of people write most of the pages. Initially a few central pages were owned by webadmin and the rest were in folders owned by individuals, making for easy management and identification of ownership. Each page mentioned its author, so bug reports could easily be directed to the right person, and (except for the top level) folders didn't contain files with a variety of owners.

There are disadvantages to this (e.g. when people leave, their pages need to be moved) but when we tried having more central pages authored by a role rather than an individual, mail to that role-name was left unanswered. It can take over a year for an incorrect sentence to be removed from a page, even with reminders.

Multiple authorship introduces other problems too - the standard of the HTML varies widely, and also when an author produces a new page they need to tell other authors to link to it.

Solutions

In 2009 a student created a pilot system based on WordPress blog software, hoping to leave behind some of the above-mentioned beliefs. It shares many design ideas with MIT's system, though we were unaware of MIT's plans at the time.

  • Comments can be added by anyone to pages
  • A WYSIWYG editor and form-based input means fewer errors and easier authoring
  • Pages can be drafted so that someone else can authorise them.
  • There's more automated page- and link-checking
  • It's a blog, and blogs aren't old fashioned - they're fun (User 2.0).
  • When a new page is created it appears immediately in other pages' "related links" lists
  • Authors can add a comment to other pages, mentioning their new page. Better still, each page has an auto-generated "Related Pages" section at the end that lists pages with related tags.
  • A Wiki-style option is possible, letting authors reversibly change other authors' pages
  • A WYSIWYG editor will guarantee more consistent (but not necessarily better) HTML
  • The system automatically records authorship
  • Line managers can list all the pages written by particular authors, along with modification dates.

"The Corporate Blogging Book", Debbie Weil (Piatkus, 2006) looks at issues relating to the introduction of blogs into an e-mail-literate workplace. It mentions inhibitions

  • If bosses don't blog, why should the employees?
  • Some users and management think that time will be wasted (it will, if the resulting pages aren't used and advertised by staff)
  • People who are confident enough writers to post e-mail have doubts about producing web pages (because of larger audience, and uncertainty about etiquette)

The book also mentions advantages, some of which haven't yet been mentioned

  • RSS feeds help reduce bulk mailing
  • Less distance between "us and them" - students and staff

For more details, see the student's Final Report.

Meanwhile, in October 2010 we gave the old material a new front page.

Oct 2010