COMPUTER SCIENCE DEPARTMENT

The past year—the second inhabiting our new space in TCL—was one of continued growth and activity for the department. As the result of a very successful search, we are delighted to be welcoming Stephen Freund to our faculty this coming fall. Steve received his Ph.D. from Stanford a few years ago, and has been pursuing his research in programming languages at the Compaq Systems Research Center in Palo Alto. The class of rising senior CS majors numbers 25, the largest cohort we’ve seen yet. Our facilities continue to expand and improve: Professor James Teresco’s 21-processor Sun Microsystems parallel computer cluster is up and running and has already been used in a variety of research and teaching projects. This summer we will be upgrading our lab for introductory courses with Macintosh 900 MHz G4 towers running system OS X.
Professors Bruce, Danyluk and Murtagh, continued their work on developing new materials for the innovative introductory course developed and taught at Williams over the last three years. Supported by a $287,892 grant from the NSF CCLI program, this new approach uses Java as the programming language, and places a heavy emphasis on event-driven programming and concurrency early in the course. The grant supports the writing of a textbook for this course as well as further refinement of the ‘objectdraw’ library and other supporting software for the course. Krishna Kannan ’03 worked with the faculty on software for the course in the summer of 2001. In August 2001, several faculty from other schools came to Williams to learn how to use our materials in preparation for teaching courses based on this material in the 2001-2002 academic year. In the summer of 2002, another 15 faculty will be coming for a four-day workshop on using these materials. Several chapters of the book are now completed, with most of the others to be completed by the end of the summer of 2002.
Bruce, Danyluk, and Murtagh made a number of presentations on their approach this year. Bruce presented their paper “Event-Driven Programming Facilitates Learning Standard Programming Concepts” at the Fifth Workshop on Pedagogies and Tools for Assimilating Object Oriented Concepts at OOPSLA ’01. All three gave a tutorial on the approach at the SIGCSE 2002 Symposium on Computer Science Education, held in March.
Professor Duane Bailey worked with two seniors, Todd Gamblin and Feng Zhu. Bailey and Gamblin studied possible mechanisms to develop cache-conscious memory allocation in Java (this work builds on research with Art Munson '01). Because Java is based on a virtual machine abstraction, many of the traditional performance-enhancing optimizations cannot be directly targeted to modern hardware; this research addresses that disconnect. Work with Zhu investigated an aperiodic tiling of the plane using a single tile. Their tile is constructed from fractal-like regions that allow two or more adjacent tiles to intermingle. Currently, at least two non-fractal tiles are required to tile the plane aperiodically.
This summer, Bailey and Christopher Cyll '04 are continuing work on a web-based graphical modeling environment. The hope is to develop a free, language-based system for generating three-dimensional models that ultimately may be remotely rendered using professional engines. The work may the basis for courseware to be used in spring of 2004.
Bailey is working with the Foundation for Excellent Schools under an Arthur Vining Davis grant. The project pairs college-level instructors and high school teachers in an effort to revitalize curricula in challenging high school environments. Bailey is working with mathematics faculty at the Academy of Environmental Science and Theodore Roosevelt High, both in New York City. The collaboration will develop a web site of resources over a two-year period.
Bailey is currently a reviewer on the NSF's panel for CCLI grants in Computer Science. These grants fund research into and implementation of new curricular approaches to teaching computer science. Bailey was a reader for the Goldwater Scholarship, a congressionally funded award given to best sophomore and junior science research students in the nation. He is also reading applications for the Jack Kent Cooke Undergraduate Scholarship in its inaugural year. The second edition of Bailey's book, Java Structures, is due out in mid-July.
Professor Kim Bruce continued his research on the type theory and semantics of object-oriented languages. The overall goal of his research is to increase the understanding of the key concepts of object-oriented languages in order to promote the design of more secure and expressive object-oriented languages.
Bruce's main research work during the last several years involved writing a monograph titled Foundations of Object-Oriented Languages: Types and Semantics. The book, a careful account of research over the last ten years on the types and semantics of class-based object-oriented languages, was published in February by MIT Press. Bruce presented a 3-hour tutorial on material from the book at the European Conference on Object-Oriented Programming, held in Malaga, Spain, in June 2002.
During the summer of 2001, John “Nate” Foster ’01 worked with Bruce on extending Sun's Java compiler to support a language, LOOJ, with a more expressive type system than Java. This work was an extension of Nate’s honors thesis. The summer work involved porting the language to a compiler built using Sun's newest Java compiler and solidifying the implementation of some of the more exotic language features involving parametric polymorphism and the ThisType and ThisClass constructs.
Cheng Hu ’03 also worked with Bruce in the summer of 2001 on a project that will eventually result in modifying the Java Virtual Machine Language to better support parametric polymorphism and other advanced language features. Rob Gonzalez ’03 will continue this work in the summer of 2002.
Doug Thunen ’02 worked with Bruce this year on extending and refining the module system of the locally designed and implemented object-oriented language, LOOM. This work resulted in a language in which modules can have multiple interfaces depending on the needs of the programmer. The compiler for this language now generates code for a typed intermediate language that is itself interpreted. Diane Bennett ’03 will continue this project in the summer of 2002 by implementing more features of the language in the compiler and re-targeting the compiler to the FLINT compiler back-end supported by researchers at Yale University.
Bruce presented a paper co-authored with Professors Danyluk and Murtagh, “Event-Driven Programming Facilitates Learning Standard Programming Concepts,” at the OOPSLA 2001 Fifth Workshop on Pedagogies and Tools for Assimilating Object-Oriented Concepts, in October 2001. Bruce is co-organizer of the similar ECOOP 2002 Sixth Workshop on Pedagogies and Tools for Learning Object-Oriented Concepts in Malaga, Spain, in June 2002.
Bruce was involved in several other activities through the last academic year. He served as outside evaluator in two tenure and promotion cases this fall, and served on the Ph.D. committee of a graduate student at Yale University. He also participated on a NSF Panel evaluating grant proposals in Washington, D.C. He served on a visiting committee for the Mathematics and Computer Science Department at Middlebury College in March 2001. He continues to serve on the steering committee for the International Conference on Functional Programming. He attended the OOPSLA 2001 meeting in Tampa in October; the POPL 2002 and FOOL 9 meetings in Portland, Oregon in January 2002; the SIGCSE meeting in Northern Kentucky in February 2002; and the ECOOP 2002 meeting in Malaga, Spain in June.
On campus, he finished a two-year term on the college Committee on Appointments and Promotions, and will begin a term on the Faculty Steering Committee in July 2002. He also served as local arrangements chair for the 2001 meeting of the Liberal Arts Computer Science Consortium, which met at Williams in August 2001. He will serve as convener of the 2002 summer meeting of the same organization at Grinnell College in Iowa.
Associate Professor Andrea Danyluk continued her research in Artificial Intelligence, specifically in the area of Machine Learning. Danyluk’s research focus is on the application of inductive machine learning algorithms. Inductive learning algorithms extrapolate general rules or trends from large data sets. One might consider applying such algorithms to a database of cancer patients’ records, for instance, to determine general rules that might predict the recurrence of a particular type of cancer.
Evan Sandhaus did an honors thesis this year under the supervision of Danyluk. Evan investigated the utility of a developmental ordering and presentation of inputs to a neural network. Neural networks are able to encode arbitrary functions. Often, they are used to model classifier systems, i.e., systems where a classification (the output) for an object needs to be determined from a set of observables (the input). Building a neural network classification system can be hard. In some cases, the classification problem is so hard that experts in the area have difficulty articulating the rules for doing classification. In other cases, a classifier is desired, but the problem is not sufficiently well specified to build one. In these cases, a neural network can be trained by exposing it to a series of examples of inputs and associated outputs. Typically, the input examples are shuffled so that they may be presented to the network in random order. The rationale for doing this is the concern that a particular ordering might bias the neural network to learn a suboptimal classifier. There has been some evidence, however, that a carefully crafted ordering of input examples can lead a neural network to learn a better final classifier. In particular, it has been suggested that a simple-to-complex ordering of the examples, as determined by an expert in the application area, can lead to better classifiers. Evan tested the generality of this claim by devising an algorithm for ranking examples and then devising several different algorithms for presenting these to a neural network for training. His extensive tests cast doubt on the general utility of a simple-to-complex ordering, and it will be interesting to explore this outcome with even more tests.
In the summer of 2001, Danyluk was co-chair of ICML-2001, the Eighteenth International Conference on Machine Learning, together with Professor Carla Brodley of Purdue University. This conference brought 250 leading researchers from around the world to Williams to present their work and to exchange ideas. The conference began with a day of tutorials and workshops, followed by three days of technical talks and poster sessions. The conference proceedings, containing approximately 80 papers, have been published by Morgan Kaufmann. Danyluk and Professor Brodley are now editing a special issue of the Journal of Machine Learning Research containing expanded versions of selected research presented at the conference.
Danyluk is now serving as a member of the advisory committee for ICML-2002, which will be held in Sydney, Australia. This year she also served as a member of the program committee for ICDM ’01 (the 2001 IEEE International Conference on Data Mining) and for DMLL-2002 (the First International Workshop on Data Mining Lessons Learned, to be held in conjunction with ICML 2002). She is also a member of the panel for the Seventh SIGART/AAAI Doctoral Consortium to be held at the Eighteenth National Conference on Artificial Intelligence in July 2002, and served as a reviewer for the Journal of Machine Learning Research and for Wiley & Sons.
Professor Bill Lenhart returned from a semester sabbatical to resume his role as department chair. He continued his research in graph theory and graph drawing. In February, Lenhart presented a lecture entitled “Geometric Graphs, Graph Drawing and Minimum Weight Triangulations” at Pomona College. Lenhart and one of his research collaborators, Professor Ryan Hayward of the University of Edmonton, were invited to submit a paper entitled “Bichromatic P4 Composition Schemes for Perfectly Orderable Graphs” to a special issue of the journal Discrete Applied Mathematics dedicated to the conference Brazilian Symposium on Graphs, Algorithms and Combinatorics 2001 in Fortaleza, Brazil. The paper is currently being reviewed. This summer Lenhart will be attending the ACM SIGGRAPH conference on computer graphics in San Antonio.
Among his other activities, Bill served as an external evaluator for a professor under consideration for promotion to full professor. He also served as a National Science Foundation panelist for their recent Director’s Award for Distinguished Teaching Scholars.
This spring, Lenhart was delighted to learn that he had been appointed the A. Barton Hepburn Professor of Computer Science, effective July 1, 2002. Of course, his greatest accomplishment this past year was that of becoming the father of Zeta Lenhart-Boyd (pictures available upon request).
Assistant Professor Barbara Lerner is continuing her research into the coordination of human and automated agents in collaboration with the Laboratory for Software Engineering Research at the University of Massachusetts, Amherst. In the past summer, two Williams students participated in this research.
Evan Sandhaus ’02 developed a meeting scheduling process using the Little-JIL coordination technology to enable the communication between multiple people scheduling a meeting and an automated agent that determined the most appropriate time for the meeting given the constraints of the participants. The main purpose of this project was to explore integration of this coordination technology into a graphical, domain-specific user interface, thereby making the coordination a natural activity performed by the user.

RepSci200209.jpg

Chris Cyll '04 discusses his work on Little-JIL
with advisor Professor Barbara Lerner.

Chris Cyll ’04 developed infrastructure to allow Little-JIL coordination to be applied to the coordination of players in a simulated soccer game using the software developed for the annual Robocup competition.
Professor Thomas Murtagh’s interest in the introductory curriculum has included efforts to design an introductory course that places more emphasis on exposing students to the nature of our discipline than on the mechanics of computer programming. He has continued to develop one such course, CSCI 105, that the department offers primarily to meet the needs of those not intending to pursue a major in computer science. He is also investigating ways to adapt such a course for use as an integral part of the sequence for majors. He presented a paper describing this work entitled “Teaching Depth-First Breadth-First” at the 6th Annual Conference on Innovation and Technology in Computer Science Education.
Murtagh began investigating schemes for adapting the congestion control mechanisms associated with the TCP protocol to provide better support for the short-lived interactions typically associated with widely used TCP-based applications. This year, he worked with Samantha Orme ’02 to quantify the nature of these interactions by gathering statistical information about the duration and frequency of various types of network connections.
Assistant Professor James Teresco continued his research on parallel and distributed computing, specifically in the area of dynamic load balancing for adaptive scientific computation. Parallel computation is a well-established and essential tool for large-scale scientific computing. Adaptive approaches, such as the automatic refinement and coarsening of meshes or the variation of method order, often necessitate linked structures that must be explicitly partitioned into sub-domains that can be distributed among processors, assigning equal work to each while minimizing communication. A balanced workload, achieved by a static partitioner, will cease to remain so under adaptive refinement, and will require redistribution during the solution process using a dynamic load balancing procedure.
Teresco is working with Sandia National Laboratories on the development of Zoltan, a software library that provides a common interface to a number of state-of-the-art partitioning and dynamic load balancing algorithms. He traveled to Albuquerque, New Mexico, twice in the past year to meet with the Zoltan group. Teresco’s work on Zoltan is in collaboration with Paul Campbell (Rensselaer and IBM), Karen Devine (Sandia), Joseph Flaherty (Rensselaer), Jamal Faik (Rensselaer), and Luis Gervasio (Rensselaer). They incorporated into Zoltan an octree load balancing procedure that uses an octree structure to cover a computational domain, dividing the work spatially among “terminal octants” (leaf nodes of the tree). A tree traversal is used to divide the octants among partitions to be distributed on a parallel computer. Recent work includes a persistent tree structure to decrease the cost of the algorithm, and the use of space-filling curves to linearize the terminal octants in a way such that the quality of the resulting partitions, and in turn, the efficiency of the computation, is improved. This work is the focus of the manuscript “Dynamic Octree Load Balancing Using Space-Filling Curves,” co-authored with Campbell, Devine, Flaherty, and Gervasio, and recently submitted to the Journal of Parallel and Distributed Computing. It was also the focus of a poster “Octree Load Balancing,” at theSymposium on Multi-Scale Computation: Simulation and Modeling that is Changing the Face of Scientific Inquiry, in Troy, New York, on November 8, 2001.
Modern parallel processing is being performed on everything from the largest tightly coupled supercomputers to clusters of workstations. Any adaptive strategy seeking optimal performance must account for processor, memory, and communications hierarchy and heterogeneity. Hierarchical and heterogeneous systems are increasingly common, and present new challenges for the development of efficient software, particularly influencing dynamic load balancing procedures. Teresco continues to work with Devine, Faik, Flaherty, and Gervasio, to support architecture-aware load balancing in Zoltan. Current work includes the automatic construction of a machine model that represents the computational domain, and on algorithms that use this model to guide load balancing. The machine model is based on one developed by Teresco as part of his Ph.D. dissertation. This work was included in Teresco's presentation “Hierarchical Programming and Dynamic Load Balancing” at the Sixth U.S. National Congress on Computational Mechanics (USNCCM IV) in Dearborn, Michigan, on August 3, 2001. This work was the focus of a poster “Architecture-Oriented Load Balancing” at the Symposium on Multi-Scale Computation: Simulation and Modeling that is Changing the Face of Scientific Inquiry, in Troy, New York, on November 8, 2001.
Teresco worked with Kai Chen ’04 during the summer of 2001 to develop a Java interface to PMDBtool. PMDBtool is a software package developed by Teresco that can refine and partition meshes, and gather statistics about the resulting partitions, allowing load balancing procedures to be compared for a variety of problems. Chen’s interface allows researchers a convenient way to choose the parameters to PMDBtool, run the program, and visualize the results. Chen’s work was included in the presentation at USNCCM IV.

RepSci200210.jpg

An early stage of construction of Professor Jim Teresco's
now completed parallel computing cluster.

Teresco and Flaherty were awarded a grant entitled “Dynamic Data Management and Load Balancing for Parallel Adaptive Computation,” to support this research. Teresco, Flaherty, and Jean-Francois Remacle (Rensselaer) have submitted a grant proposal “Hierarchical Dynamic Data Management and Load Balancing for Parallel Adaptive Computation,” to Sandia National Laboratories to continue this research for the upcoming year.
Teresco has installed, configured, and continues to maintain a 21-processor Sun Microsystems computer cluster that supports researching parallel and distributed computing in the Computer Science Department. The cluster consists of nine Enterprise servers, connected by fast Ethernet and by a gigabit interconnect from Dolphin. In addition to supporting the research described above, the cluster has been used by Steve Biller ’02, Joe Bergeron ’02, and Evan Sandhaus ’02, to run computations using large data sets for their research work. The cluster was also used in CSCI 432,Operating Systems, during fall 2001 for a project to study multithreading.
Teresco also participated in the peer review process for a grant proposal for the U.S. Army Research Laboratory.

Class of 1960 Scholars in Computer Science
Samantha Orme
Teodora Ivanova
Diane Bennett
Evan Sandhaus
Lida Ungar
Rob Gonzalez
COMPUTER SCIENCE COLLOQUIA
Associate Professor Tim Teitelbaum, Department of Computer Science, Cornell University, Co-founder CEO, GrammaTech, Inc.
“Dependence-Graphs and Program Slicing”
Dr. Jeremy Kepner, Massachusetts Institute of Technology Lincoln Laboratory
“Programming Parallel Computers for Real-Time Stream Processing”
Department of Computer Science Faculty, Williams College
“Discussion on Graduate Schools”
Professor George Heineman, Computer Science Department, Worcester Polytechnic Institute
“Ask Not What a Component Model Can Do for You. . . ”
Professor Kim Bruce, Department of Computer Science, Williams College
“Safety versus Expressiveness in Programming”
Sam Natarajan ’82, CEO, Harvest Technology; Angela M. Schuett ’94, Computer Systems Researcher, National Security Agency; Nick Branstator ’95, Vice-President of Production, Radio Voodoo; Jay Sach, Eziba; Joe Bergeron ’02, WITSway; Luis O'Shea, Vice-President of Software Development, Clickshare
“Careers in Computer Science”
Professor Charles V. Stewart ’82, Department of Computer Science, Rensselaer Polytechnic Institute
“The Retina Project at Rensselaer: Image Analysis Algorithms for Assisting in the Diagnosis and Treatment of Retinal Diseases”
Professor Michael Ernst, MIT Lab for Computer Science
“Two Applications of Dynamic Invariant Detection: Refactoring and Static Verification”
Daniel Fasulo, ’94, Celera Genomics
“A Whirlwind Tour of Bioinformatics”
Professor Michael Zuker, Mathematical Sciences, Rensselaer Polytechnic Institute
“Algorithms and Statistics for Nucleic Acid Secondary Structure Prediction”
Neil Heffernan, Ph.D., Carnegie Mellon University
“Intelligent Tutoring Systems Have Forgotten the Tutor”
Stephen Freund, Ph.D., Compaq Systems Research Lab
“Race Detection for Java Programs”
Richard Wicentowski, Ph.D., Johns Hopkins University
“Resource-Limited Machine Translation and Computational Morphology”
Christopher H. Lee, Ph.D., Field and Space Robotics Laboratory, MIT
“Modeling Gestures for Recognition and Robotics”
Professor David Garlan, Carnegie Mellon University
“Software Architecture: Past, Present and Future”
Professor David Garlan, Carnegie Mellon University, Class of 1060 Scholars Program
“Software Challenges for Ubiquitous Computing”
Michael Gousie, Department of Mathematics and Computer Science, Wheaton College
“Building a Surface Reconstruction Research Environment”
Professor Mauricio Marengoni, Department of Computer Science, Massachusetts College of Liberal Arts
“Dealing with Uncertainty in a Vision System”
David Detlefs, Sun Microsystems Laboratories
“Issues of Parallelism and Concurrency in Scalable Java Garbage Collection”
Dr. Wesley D. Turner, Computer Scientist – Visualization, GE Global Research
“Visualization Research in a Corporate Environment”

COMPUTER SCIENCE STUDENT COLLOQUIA
Joe Bergeron ’02, Todd Gamblin ’02, Evan Sandhaus ’02, Christopher Cyll ’04, Feng Zhu ’02, Josh Ain ’03,
Rob Gonzalez ’03
“Computer Science Majors Talk about Their Summer Work Experiences”
Feng Zhu ’02, Senior Thesis Talk
“The Search for Universal Aperiodic Tile”
Evan Sandhaus ’02, Senior Thesis Talk
“Improving the Performance of Neural Networks through the Developmental Ordering and Presentation of Input Examples”
Douglas Thunen ’02, Senior Thesis Talk
“Modules in LOOM and Their (Separate) Compilation”
Feng Zhu ’02, Senior Thesis Defense
“The Search for a Universal Tile”
Evan Sandhaus ’02, Senior Thesis Defense
“The Developmental Ordering and Presentation of Input Examples”
Douglas Thunen ’02, Senior Thesis Defense

OFF-CAMPUS COLLOQUIA
Kim Bruce, Andrea Danyluk and Tom Murtagh
“Event-Driven Programming Facilitates Learning Standard Programming Concepts”
Fifth Workshop on Pedagogies and Tools for Assimilating Object-Oriented Concepts, at OOPSLA 2001,
Tampa, Florida
William J. Lenhart
“Geometric Graphs, Graph Drawing and Minimum Weight Triangulations”
Pomona College
James Teresco
“Hierarchical Programming and Dynamic Load Balancing"
Sixth U.S. National Congress on Computational Mechanics, Dearborn, Michigan, August 3, 2001.
“Octree Load Balancing”
“Architecture-Oriented Load Balancing”
Symposium on Multi-Scale Computation:
“Simulation and Modeling That Is Changing the Face of Scientific Inquiry”
Troy, New York, November 8, 2001.
Kim B. Bruce
“A Library to Support a Graphics-Based Object-First Approach to CS 1”
SIGCSE 2001, February 2001, Charlotte, North Carolina
Kim B. Bruce (with Professors Andrea Danyluk and Tom Murtagh)
“Events and Objects First: An Innovative Approach to Teaching Java in CS 1”
Workshop at CCSCNE, April 2001, Middlebury College

POSTGRADUATE PLANS OF DEPARTMENT MAJORS
Joseph C. Bergeron
Research at Williams College, Williamstown, MA
Benjamin M. Birney
Broadway in Chicago, a Nederlander Organization and is composing a musical he hopes will be produced
Jason C. Chapman
Undecided
Jesse J. Davis
Undecided
Benjamin Doob
Undecided
Todd G. Gamblin
ValueCommerce, Tokyo, Japan
Michael B. Gross
Undecided
Benjamin Isecke
Undecided
Max F. Niederste-Ostholt
A.T. Kearney, a Strategic Consulting Company; Düsseldorf, Germany
Samantha L. Orme
Undecided
Evan S. Sandhaus
Undecided
Douglas K. Thunen
Graduate School at Princeton University, NJ
Lida P. Ungar
Research at Johns Hopkins University
Willie Wu
Travel, Graduate school in Fall 2003
Feng Zhu
Will pursue a Ph.D. in Information Technology and Management at Harvard Business School, Cambridge, MA