
Re: TULK



Hello,

Udi Finkelstein wrote:
> However, I find that most of the time I get the information I need 
> from dynamic resources such as public usenet newsgroups, 
> company-specific NNTP based newsgroups  (microsoft.*, intel.*, 
> staroffice.*, etc.), and mailing list.
>
> Some of the information I'm looking for is not answered by "standard"
> documents because:
> 1. They are updated much slower.
> 2. They don't cover my particular configuration. (Software X doesn't 
> work with hardware Y - See the latest StarOffice 5.0 crashes with S3 
> based cards issue).
>
> This type of information will always be answered by internet-based 
> forums where there will always be at least one more user (out of the 
> millions using Linux) with the same type of problems as you have, even 
> if it occures only with a peculiar hardware/software combination.


Discussion systems such as newsgroups, mailing lists and bulletin boards
are a great thing. After all, brains are by far the best storage and
retrieval facility, especially /other/ people's brains...

However, I don't believe posting to discussion systems is a general
solution. To list a few problems:
1. Latency
   It usually takes a few hours, sometimes even days, to get a
   relevant answer. Not satisfied? Another round will take you
   the same time.
2. Scalability
   Discussion systems can handle only so many active users before
   being overwhelmed by sheer quantity. One solution is fracturing, 
   as in the numerous Linux UG mailing lists, but I don't think it's
   generally applicable (yet; this is certainly an area for interesting
   research). The more popular solution, though, is FAQ pages and cries 
   of "RTFM!", which bring us back to static documentation.
3. Global Efficiency
   The total time invested in your typical newsgroup is quite 
   staggering. The asker of a question may invest little effort, but
   every question asked and every post following it get scanned by 
   numerous people, most of whom won't find it interesting. This /has/
   to be the case to achieve the "at least one user out of millions"
   effect you described.
4. Access
   You must be on-line to ask your question, and to check for answers. 
   This will not become trivially dismissible in the near future.

Again, I'm not saying discussion systems are without use. They're
irreplaceable for help in addressing context-specific problems ("Help! My
program doesn't compile!"), when available documentation has been
exhausted, or when discussion is to take place and opinions formed. But
they are not a long-term replacement for a knowledgebase.


All of the above problems are solved, however, if you treat newsgroups
and their ilk as a static resource, by searching the archives instead of
posting a question. This use of discussion systems has totally different
characteristics. There are some very strong properties: high update
rate, sheer quantity, and perhaps most importantly a focus on users'
needs, since what gets discussed is precisely what people are interested
in.

Alas, discussion archives too are not a replacement for a well-edited,
"authoritative" (in the sense I used in TULK) database. To list a few
drawbacks:
1. Signal-to-noise ratio is usually poor, and often horrendous.
2. Related to the above, discussion posts are significantly less
   thought out than non-volatile articles. Less effort is invested,
   and this certainly shows in depth and accuracy.
3. When you reach a post, it often doesn't contain enough context
   to be easily understandable. It sometimes takes a lot of quotation
   reading, or even going back to previous posts in the chain, just to
   realize it's irrelevant to your needs.
4. In practice, there are some serious problems with searching
   discussion archives. Usenet is fine, but if you want to search
   mailing lists or discussion boards, you must learn of their existence
   first, and search each separately. This problem is especially
   significant when conducting unfocused searches, or for new users.
   In fact, many mailing lists don't even have search facilities!
5. Discussion board postings usually contain few outside links,
   such as canonical or related references. This makes it hard to
   get context, or to navigate around in problem-space when someone's
   /almost/ talking about your problem.

In my view, raw discussion archives are only a fallback strategy if you
don't find your answer elsewhere. They /may/ contain your answer, but
the effort in finding it is much higher. 

Discussion systems do raise an interesting possibility, though:
Harvesting useful information, editing it into self-contained texts and
adding them to the knowledgebase. Imagine keen newsgroup harvesters
intent on their work (or just the occasional mailing list reader)
finding an informative post, providing some context to replace the
quotations, and promptly submitting the result. 
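
As a rough illustration of the first harvesting step, here is a minimal
Python sketch that separates a post's own text from its ">" quotations,
so an editor can replace the quotations with a short context paragraph.
The function name split_post and the sample post are hypothetical, and
the sketch assumes quotations are marked with a leading ">", as in this
message.

```python
def split_post(body):
    """Split a post into its own text and the lines it quotes.

    Assumes quoted lines start with '>' (possibly nested, e.g. '> >').
    """
    own_lines = []
    quoted_lines = []
    for line in body.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(">"):
            # Drop the quote markers and surrounding spaces.
            text = stripped.lstrip("> ").rstrip()
            if text:
                quoted_lines.append(text)
        else:
            own_lines.append(line)
    return "\n".join(own_lines).strip(), quoted_lines


# Hypothetical sample post.
post = "Udi wrote:\n> Does X work with Y?\n>\nYes, since version 2.\n"
own_text, quotes = split_post(post)
```

The editing itself (writing the context, checking accuracy) remains
manual work; the sketch only removes the mechanical part.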

If you recall, the conclusion of my article
(http://www.forum2.org/eran/tulk/) envisions building a system that
makes bazaar-style documentation possible. Discussion harvesting could be
a significant factor in this.


> Besides, a quick-n-dirty solution is still a keyword based full text 
> search engine, indexing all your main-pages, HOWTO's, and /usr/doc/* 
> can be built much quicker than the long term solution you suggest.

Granted, it's much quicker to achieve, and probably worth doing as a
stopgap measure. But:
* Such a search often yields dozens of irrelevant entries when searching
for some simple command. This will get increasingly worse.
* man pages are usually at the quick-reference level. To learn something
in the first place you need to either go buy a book, or learn by trial
and error (which /we/ may prefer, but most people don't).
* The scope of man pages is mostly limited to things after which you
press Enter or put a ";". To get even something as elementary as an RFC,
you need to be on-line.
* HOWTOs are great; there should be more of them; the LDP already uses
SGML.
* Non-textual output, inter/intra-document links, annotations, logical
hierarchy... See section 2 of my article.
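
To make the stopgap and its first drawback concrete, here is a minimal
sketch of a keyword-based full-text index in Python. The names
(tokenize, build_index, search) and the three sample documents are
hypothetical stand-ins; a real indexer would read the man pages, HOWTOs
and /usr/doc/* from disk.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase the text and split it into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    # Map each word to the set of document names that contain it.
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    # Return the documents containing every query word (AND search).
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents standing in for man pages and HOWTOs.
docs = {
    "ls.1": "ls - list directory contents",
    "find.1": "find - search for files in a directory hierarchy",
    "Printing-HOWTO": "how to list the print queue and find a printer driver",
}
index = build_index(docs)
hits = search(index, "list")
```

Even in this tiny collection, a query for "list" matches both the ls
page and the unrelated printing document. With no ranking and no notion
of what a document is about, that effect grows with the collection.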


BTW, can anyone tell me what free documentation exists for X, GNOME/GTK
and KDE/Qt, and in what formats? I've never developed for these.

  Regards,
    Eran Tromer