FROM: Geoffrey Weatherall
      University of Waikato
      Hamilton
      NEW ZEALAND

RE: Some VMS News Enhancements

I have made some enhancements to ANU News. They are:

1. A quicker startup and smaller NEWSRC. file.
2. A menu user interface which operates alongside the command line system.
3. Error message stripping to make the messages more suitable for new or
   infrequent users.

These three enhancements can be installed independently into the NEWS source
code.

The quicker startup on our system (carrying about 900 newsgroups) reduces the
startup time from 30-40 seconds to 5-7 seconds. It also reduces the size of
the NEWSRC. file from 70 blocks to 2 blocks (the size of the reduction depends
upon the number of newsgroups).

The menu system provides a "Microsoft Word" type menu. The command line
remains fully operational when the menu system is installed. At the NEWS>
prompt the user can press the ESC key and a menu appears on the bottom two
lines of the screen.

The error message stripper removes the string "%CLIxxxxxxxxxxxxx," from the
front of messages from the command line interpreter (this information being
of little or no value to most users). It also removes the string "Error:"
from the front of other messages.

Bill Fenner

NNTP_TCPUCXM.C : NNTP_SERVER now expects read_net() to return the number of
                 bytes read. Modified queueing structure to allow for this.
NEWSADD.C : Changed log file format to add the hostname we received the
            article from.
            *** Note: this breaks FEEDCHECK.C, but it was already broken for
            NNTP folks
NEWSDIST.C : Changed scansys() to initialize sys_article_count,
             sys_article_bytes, and sys_id_count to 0.
             Changed sys_remote_send() to actually count sys_article_count
             and sys_article_bytes.
NEWSSITE.C : #defined ORIGINAL_no_priv, since even though 5.3-1 defines
             JPI$_RIGHTSLIST, it doesn't work.
NEWSCONTROL.C : Added support for descriptions in newgroup control messages
                of the form:

                    For your newsgroups file:
                    news.group.name Description

                Completely blank lines are allowed between the two lines,
                but they must contain no whitespace at all. David Lawrence's
                messages never have a blank line, but others' sometimes do.
NEWSFILES.C : Saved the RMS error when opening a file, for use below
NEWSUTILITY.C : Added more description to error messages
NNTP_SERVER.C : Added more description to error messages

Date: 10 Aug 92 21:12:00 CST
From: "FLOWERS HARRY"

Okay, here's a patch to V6.1a4 NEWSRC.C to change the behavior; now, it will
not make any .OLD files, but, if it encounters an error when updating the
NEWSRC file, it will retain the old one and issue an error for the user. The
status given will be "1" if the error was encountered when writing the file,
and "-1" if the error occurred in flushing the buffers on the close.

I've tested it, and I haven't been able to get NEWS to leave a corrupted
NEWSRC file with this fix. Hopefully, this will make it into the V6.1
distribution. If you have any problems with this fix, please let me know.

Date: Tue, 11 Aug 1992 16:06:55 GMT
Reply-To: system@FALCON.NAVSSES.NAVY.MIL
Sender: ANU-NEWS Discussion
From: system@FALCON.NAVSSES.NAVY.MIL
Subject: Bug in ANU NEWS NNTP server for TCPWARE
To: Multiple recipients of list ANU-NEWS
Content-Length: 4202

The NNTP server code in ANU-NEWS V6.1 Beta 1 is broken when it is used with
Process Software's TCPware for VMS product. With the help of Bernie Volz
(Volz@process.com) from Process Software I was able to get a version that
works with the version of TCPware for VMS that is running on NAVSSES' VAXes.
The fix required that changes be made to only one file,
NNTP_TCPWINMULTINET.C. Below are the differences between the V6.1 Beta 1
(NNTP_TCPWINMULTINET.ORG) and my hacked version (NNTP_TCPWINMULTINET.C).
--
Mike Jacobi
NAVSSES VAXCluster system manager
System@Eagle.Navsses.Navy.Mil
System@Falcon.Navsses.Navy.Mil

Date: Tue, 11 Aug 92 13:41:32 PDT
From: "Mark Pizzolato 415-369-9366"
Subject: Re: A beta version available (finally!)
To: gih900@aarnet.edu.au

Attached are my latest changes after updating to V6.1-B1.

Please change the NEWS_VDATE stuff in NEWSSITE.H to reflect the new version!

The changes to feedcheck reflect necessary changes due to the log file format
change introduced by Bill Fenner's modification of the "Add " log lines. No
big deal. Also included are the latest updates made to this program for other
reasons.

NEWSDIST had several minor bugs that I had recently fixed without sending you
the updates; I've merged these into the new stuff.

NEWSFORWARD has a minor enhancement to provide the default subject line as
"Re: subject" if the original subject didn't have "Re:" to begin with. This
now behaves as followup does.

NEWSRTL has a new routine which is currently used only by read_config (i.e.
when reading NEWS.SYS). This routine removes embedded comment lines and does
NOT allow comment lines to be continued if they end with a '\'. I got real
confused a while back when I commented out a line in my NEWS.SYS file:

    infopiz:all,world,inet,news,vmsnet,vmsnetwg,net,to,comp,sci,rec,gnu,news,\
    #misc,soc,talk,usa,ba,ca,na,\
    biz,bionet,inet,ddn,uc,ucb,la,unix-pc,su,alt,local,junk,control,general,net::

--
Mark Pizzolato - INFO COMM Computer Consulting, Redwood City, Ca
PHONE: (415)369-9366
UUCP: decwrl!infopiz!mark or uunet!lupine!infopiz!mark
DOMAIN: mark@infocomm.com

Date: 11 Aug 1992 14:41:50 -0500 (CDT)
From: Bob Sloane
Subject: Re: A beta version available (finally!)
To: G.Huston@aarnet.edu.au

Here are some patches I have put in V6.1a4 that you might want to add to your
local changes. Most of them are very minor. There is a new version of CACHE.C
and changes to NNTP_SERVER.C and NEWS_BUILD.COM to add the NNTP cache stuff
in. I have been using it for quite a while now with no problems.
==============================================================================
In NEWSADD.C, whitespace is accepted in header keywords.
==============================================================================
In newspost.c, there are references to undeclared variables if the
conditional compilation for spawning mail is used.
==============================================================================
In NEWSSETSHOW.C, when changing a group from restricted to not restricted,
the access file is not deleted.
==============================================================================
The following changes are for nntp caching.
==============================================================================
Here are some changes to the nntp cache build procedure to move the object
files to NEWS_BUILD.
==============================================================================
Here are some changes to NEWS_BUILD.COM that link in the NNTP cache code.
==============================================================================
Here is the current version of CACHE.C. This version seems to work.

--
USmail: Bob Sloane, University of Kansas Computer Center, Lawrence, KS, 66045
E-mail: sloane@kuhub.cc.ukans.edu, sloane@ukanvax.bitnet, AT&T: (913)864-0444

The following changes have been made by Mark Pizzolato (mark@infocomm.com):

NEWSDEFINE.H
  - Added sys_statb & sys_fp to the sys_entry structure.
  - Added include of stdio.h
  - Added sys_article_count, sys_article_bytes, sys_id_count to sys_entry
    structure.

NEWSEXTERN.H
  - Added definition of flush_downstream in NEWSDIST
  - Added definition of set_mail_self_flag in NEWSFORWARD
  - Added new entry points for routines in NEWSLOCK

ADD_TRANSFORM.C
  - Fixed improper conditional compile "#ifdef UWO" to "#if UWO"

NEWGRPFILE.C
  - Removed generally redundant lock code that now exists in NEWSLOCK.C since
    it is called from multiple places.
  - Added code to call new generic routine.
NEWITMFILE.C
  - Removed generally redundant lock code that now exists in NEWSLOCK.C since
    it is called from multiple places.
  - Added code to call new generic routine.

NEWSFORWARD.C
  - Added functionality to determine the default state of the /SELF qualifier
    for the FORWARD and REPLY commands, from the user's mail profile
    information.

NEWS.HLP
  - Updated to reflect the /SELF functionality for the FORWARD and REPLY
    commands.

NEWSCMD.CLD
  - Updated to reflect the /SELF functionality for the FORWARD and REPLY
    commands.

NEWSDIST.C
  - Reworked sys_remote_send to minimize the number of disk I/O operations
    that are performed:
      a) keep downstream batch files open across calls (to minimize the
         number of file opens, closes, and buffer flushes), and
      b) avoid opening and reading each outbound file multiple times by
         buffering the item file's contents in a memory buffer (saving an
         extra copy of the item from being written to SYS$SCRATCH).
    Both of these changes trade off virtual memory resources for
    performance. This trade-off is acceptable since it really only affects
    the NEWS MANAGER process which is being used for ADD processing, and
    does not affect the general user. Keeping multiple downstream batch
    files open can be disabled by defining CLOSE_BATCHFILES.
  - Added new routine flush_downstream, to close the downstream batch files
    that are kept open above. This routine now also reports what (articles,
    article IDs and batch size) has been passed downstream.

NEWSDELETE.C
  - Updated to flush downstream batch files after a cancel command.
  - Updated to return number of history records deleted in hist_skim

NNTP_SERVER.C
  - Updated to flush downstream batch files after a remote post.

NEWSADD.C
  - Added useful message in log file when problems occur adding control
    messages.
  - Added reporting of input batch size when announcing the batch file name.
  - Fixed calls to fgets to avoid possible buffer overruns.
  - Fixed bug dealing with input batch parsing when the input file has been
    extracted from Mail and contains more than one message.
  - Timing tests of various VAXCRTL I/O routines vs. file formats reveal the
    following:
      1) The RMS Multi Block Count (Process or System) is not used when
         dealing with STREAM_LF files, so explicitly specifying a value for
         "mbc=" when opening a file is desired for more efficient file I/O.
         This efficiency is realized by:
           a) lower I/O counts
           b) faster I/O throughput
           c) lower CPU time.
      2) The VAX CRTL is MUCH more efficient at reading files which are
         STREAM_LF in format. For example, reading a large file (1 MB) with
         calls to fgets, with the file opened with mbc=120, took 1.67 CPU
         seconds when the file was STREAM_LF and 6.64 seconds when it was
         Variable. Writing the same file with the same buffering consumed
         8.96 CPU seconds when the file was Variable vs. 1.96 CPU seconds
         when the file was STREAM_LF.
    The general message is clear. Stream_LF file format is preferred.

    Well, I questioned this conclusion, and compared the VAXC RTL I/O vs.
    direct calls to RMS $GET for the same files. With a large MBC, RMS could
    read the Variable file in 3.82 CPU seconds vs. the 6.64 for VAXC RTL.
    The Stream_LF file was read by RMS in 9.68 CPU seconds versus 1.67.
    Writing a Variable format file with RMS took 5.81 CPU seconds vs. the
    8.96 for VAXCRTL. Writing the Stream_LF file took 6.36 CPU seconds with
    RMS, vs. the 1.96 CPU seconds for the VAXC RTL.

    NOW, the real conclusion should be: the preferred file format depends on
    how you are going to read (most frequently) or write the data, whether
    through the VAXC RTL (fgets, fputs) or through direct RMS calls
    (SYS$GET, SYS$PUT). VAXC RTL implies STREAM_LF, and RMS implies
    VARIABLE/CR.

    The changes I've implemented (in create_article) actually create item
    files with Stream_LF format, since a cursory observation showed that
    things use do_open_item to open an article followed by fgets to read it.
    I have also explicitly specified values for mbc here in NEWSADD and
    elsewhere, to help realize the performance gains noted above. I may have
    been too extravagant with my choices of the mbc values in each case, but
    I have selected values which seem appropriate to let the runtime
    environment read ahead (or write behind) enough so that most files are
    read in one or two I/Os.

NEWSPOST.C
  - Updated to flush downstream batch files after a local post.
  - Changed all file opens to specify an mbc value to dramatically reduce
    the number of I/Os and increase the resulting throughput.

NEWSBUILD.COM
  - Simple fix to allow Control-Y to abort compiling
  - Added code segment to compile new program NEWSSHUTDOWN
  - Changed build of NEWITMFILE, NEWGRPFILE to link against NEWSLOCK

NEWSSKIM.C
  - Updated to report number of history records deleted by hist_skim

26-Feb-1992:

NEWSCONTROL.C
  - determine the "approval" status of a control message (checkgroups,
    newgroup, rmgroup) from the global variable itm_approved, which is set
    to include the user specified /ACCEPT qualifier which allows overriding
    of missing "Approved:" header lines.

NEWSADD.C
  - use the value returned by parse_control_item (its parse status), and in
    the event of a failure, report a message indicating the failure reason.
    In the past, many failed control message postings were silently accepted
    to the control newsgroup, the control action was NOT taken, and there
    was no indication that it had not been. If someone was interested in why
    a control message was seemingly ignored in this way (e.g. because of a
    missing "Approved:" header), they had to dig through things with a debug
    version of NEWS.

4-Apr-1992:

NEWSLOCK.C
  - Added generic code to support programs that want to enable the locking
    mechanisms.
  - Changed the code to always exit with a SS$_DEADLOCK status IF exit is
    forced due to the news system being locked by either the system lock or
    the NEWS_STOP logical.
NEWSRTL.C
  - Changed the buffer the routine file_copy uses while copying a file. This
    minimizes the file copy time and maximizes throughput.

NEWSSHUTDOWN.C
  - New program which locks the news database, and invokes a user specified
    command in the context of a subprocess with the news database locked.

FEEDCHECK.C
  - NEWS Log file analysis program to produce reports on the quality of a
    site's news feed.

FEEDCHECKCMD.CLD
  - Command definition for the FEEDCHECK program.

NEWS_SYSTARTUP.COM_DECUS_UUCP
  - This is the latest version of the DECUS uucp NEWS_SYSTARTUP.COM

NEWSSKIM.COM_DECUS_UUCP
  - This is the latest version of the DECUS uucp NEWSSKIM.COM

NEWS_RNEWS.COM_DECUS_UUCP
  - This is the latest version of the DECUS uucp NEWS_RNEWS.COM procedure

Date: Wed, 26 Feb 92 20:01:52 PST
From: rankin@EQL.Caltech.Edu (Pat Rankin)
Subject: news 6.1a3 editing changes
To: g.huston@aarnet.edu.au
Content-Length: 14338

I was unable to force GCC to set aside a "hidden local variable" at a
specific stack location, so it's not safe to call VAXC$ESTABLISH when using
that compiler. In order to reinstate the old GNUC_HANDLER_HACK code, it was
necessary to be able to link with sys$library:vaxcrtl.olb rather than
sys$share:vaxcrtl.exe. The best way to accomplish that was to remove all
calls to TPU$TPU so that TPUSHR didn't drag VAXCRTL in along with it.

What I've done is change the editing code so that both TPU$TPU and EDT$EDIT
are activated dynamically, like any other potential callable editor. That has
the minor advantage that there are now two fewer shareable images activated
at NEWS startup.

This has not been exhaustively tested; there are bound to be bugs. I did test
both REPLY and READ/EDIT, both with MAIL$EDIT defined as CALLABLE_TPU and
with SET PROFILE/EDITOR="@local_stuff:MAILEDIT.COM", which invokes a
"foreign" editor. I did not attempt to test EDT, or callable_LSE, or
what-have-you...
Date: Fri, 30 Aug 91 16:26:20 +1000
Message-Id: <9108300626.AA01260@jatz.aarnet.edu.au>
To: _gih900@aarnet.edu.au
From: bill@WAIKATO.AC.NZ (by way of G.Huston@aarnet.edu.au)
Subject: Message id caching

I have written some message caching code (similar to Paul Vixie's (decwrl)
msgidd code) which some ANU News users might find useful. What it does is use
a global section for the cache, and it stores the last 8192 message ids
received, and rejects any that are already in the cache. It can be hooked
into the NNTP_SERVER code by changing one line:

    if (cache_check(l_id) || itm_check(l_id) || hist_check(l_id)) return(1);

Because of the global section, all incoming newsfeeds MUST come into a single
node.

I have a cache master utility as well that can display the cache, and various
other counters (such as the number of insertions and rejections). E.g.

    Number of insertions    = 901369
    Number of rejections    = 630621
    Percentage of rejects   = 41.2
    Current index in cache  = 8110
    Date counts last reset  = 31-MAY-1991 14:47:16.59

(We have multiple feeds from decwrl and wuarchive in the USA, and from vuw
(Wellington), which gets its feed from Melbourne and uunet - that explains
the large "Percentage of rejects").

It also has a validation option which validates all links in the cache, which
was useful for debugging and providing hashing statistics. The code also
contains exit and error handlers, plus a shareable log file, to make sure a
message does not get rejected due to a bug or inconsistent cache.

The code has been working without any problems since 31 May. If there is
sufficient interest, I can mail the source to someone overseas who is willing
to make it available for ftp. The source is too large to post here.

I have designed the CACHE.C module and the CACHEM utility to be completely
separate from News so they can be used for other caching problems, with only
one call (i.e. cache_check(id)) required by the external program.
There is also other code included for writing to a shareable log file, which
I've found useful for other utilities as well.

Bill Teahan

---

THE FOLLOWING FILES ARE INCLUDED WITH THIS PACKAGE

    CACHE.C
    CACHE.INSTALL
    CACHEDEFINE.H
    CACHEM.C
    CACHE_HASH.MAR
    CACHE_LOGGER.C
    CACHE_LOGGER_TEST.C
    CACHE_MAP.MAR
    CACHE_TEST.C
    CACHE_VERIFY.C
    LOGFAB.H
    LOGRAB.H
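
For readers who want a feel for the message id cache described in Bill
Teahan's note above, here is a minimal sketch in portable C. It is NOT the
CACHE.C shipped in this package: it keeps the cache in ordinary process
memory rather than a VMS global section, it finds entries with a plain
linear scan rather than hash links, and it has no logging or exit handlers.
The names ID_MAX, cache_insertions, cache_rejections and
cache_reject_percent are made up for illustration; only cache_check and the
8192-entry size come from the note. The percentage is computed as rejections
over all calls, which is an assumption that happens to agree with the sample
counters quoted above.

```c
#include <assert.h>
#include <string.h>

#define CACHE_SIZE 8192   /* number of recent message ids remembered */
#define ID_MAX     256    /* assumed limit on id length, including the NUL */

static char cache_ids[CACHE_SIZE][ID_MAX];  /* ring buffer of recent ids */
static int cache_index = 0;                 /* next slot to overwrite */
static unsigned long cache_insertions = 0;  /* ids added to the cache */
static unsigned long cache_rejections = 0;  /* ids found already cached */

/*
 * Return 1 if id is already in the cache (the caller should reject the
 * article), otherwise remember it and return 0. Empty ids and ids too
 * long to store are never cached, so they always return 0 and are left
 * to the caller's other duplicate checks.
 */
int cache_check(const char *id)
{
    int i;

    assert(id != NULL);
    if (id[0] == '\0' || strlen(id) >= ID_MAX)
        return 0;

    /* Linear scan; empty slots hold "" and so never match a real id. */
    for (i = 0; i < CACHE_SIZE; i++) {
        if (strcmp(cache_ids[i], id) == 0) {
            cache_rejections++;
            return 1;
        }
    }

    /* Not seen recently: overwrite the oldest slot. */
    strcpy(cache_ids[cache_index], id);
    cache_index = (cache_index + 1) % CACHE_SIZE;
    cache_insertions++;
    return 0;
}

/* Rejections as a percentage of all calls that reached the cache. */
double cache_reject_percent(void)
{
    unsigned long total = cache_insertions + cache_rejections;

    return total ? 100.0 * (double)cache_rejections / (double)total : 0.0;
}
```

A caller would use it the way the one-line NNTP_SERVER change above does:
test cache_check(id) first and fall through to the slower item and history
lookups only when it returns 0. Sharing the table between several server
processes, as the real code does, would need shared memory and locking,
which this sketch leaves out.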