The eLibrary is a component of the Scoop technology SCS recently purchased. eLibrary is now being built on Linux. Scoop's eLibrary uses a proprietary search engine called Surfinity. This component was provided by Scoop only as a binary executable. It runs solely on Windows platforms, specifically the 13-year-old Windows Server 2003. Surfinity is obsolete and no longer available for current operating systems. Even after the Scoop eLibrary source is converted to run on Linux, its search engine will still need to be replaced.
A current, supportable version of eLibrary is needed as part of providing a path forward for existing Scoop users on support. Further, it will be part of BENS (Best Ever Newspaper System). It might also be leased as a standalone product.

Spice's formula language (FL) supports database tables that have columns that are blobs (usually text fields). Full text indexing and querying is built into the formula language. The FL has been enhanced to support structured hierarchical objects (SHO). SHO are compound data objects that have a number of desirable properties. These extend beyond the relational database model currently well-supported by the FL. A SHO can be used to manage such multi-component things as advertising super orders and news packages as single variables or table entries.

SHO are implemented in the FL using JSON syntax. The FL compiler adds static type declarations to JSON. It enforces type checking for FL SHO and thus for our use of JSON. This provides an important safety net for programmers. JSON is the format commonly used for passing data among web services. JSON is also the format used for SCS's evolving enterprise microservices bus, called MOAI (Mother Of All Interfaces). (The MOAI is the key for moving beyond the linear workflows and data silos typically built into today's newspaper systems.)

We will deploy new search technology to support eLibrary and SHO in the FL. SCS's current text indexing technology is built using CLucene, a somewhat old C++ port of Apache's Lucene. Among search engines, it is regarded as small, efficient, stable and easily deployed. SCS usage includes full text indexing and searching of ads. It is also used as a fast secondary index within SCS/Track (CAS). It and its current usage will stay as is. To support SHO and JSON variables in Spice FL databases, CLucene will be enhanced. So will Spice's FL. Setting up a Lucene index involves field naming and type specifications.
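The field naming and type specifications mentioned above might look something like the following when expressed as JSON. This is only a minimal sketch in Python, not SCS's implementation: the field names, the helpers `build_mapping` and `type_errors`, and the choice of an Elasticsearch-style `mappings`/`properties` layout are all assumptions made for illustration.

```python
import json

# Hypothetical field declarations for a news-package SHO (illustrative names only).
FIELD_TYPES = {
    "headline": "text",
    "section": "keyword",
    "page": "integer",
    "body": "text",
}

# Python types accepted for each declared index type (an assumption of this sketch).
PYTHON_TYPES = {"text": str, "keyword": str, "integer": int}


def build_mapping(field_types):
    """Return an Elasticsearch-style mapping body as a JSON string."""
    properties = {name: {"type": t} for name, t in field_types.items()}
    return json.dumps({"mappings": {"properties": properties}}, sort_keys=True)


def type_errors(document, field_types):
    """List fields that are undeclared or hold a value of the wrong type."""
    errors = []
    for name, value in document.items():
        declared = field_types.get(name)
        if declared is None:
            errors.append(f"{name}: undeclared field")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(value, bool) or not isinstance(value, PYTHON_TYPES[declared]):
            errors.append(f"{name}: expected {declared}")
    return errors
```

Calling `build_mapping(FIELD_TYPES)` yields the JSON text that would be handed to the index front end, and `type_errors` stands in, very loosely, for the kind of check a compiler with static type declarations could make before a document is ever indexed.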
The FL will use JSON to pass these mappings to a new CLucene front end. The same will be done with creating, reading, updating and deleting SHO index elements. The choice for the JSON syntax of these transactions will be explained below.

Apache Lucene itself is provided in Java. It is often used as an embedded search engine in technologies such as Solr and Elasticsearch. Elasticsearch is currently the most popular search engine with over 50,000 downloads each month. (Over 20 million so far.) Elasticsearch (ES) offers a superset of Surfinity's functionality, so it is a good choice for use in eLibrary. Putting ES (and thus eLibrary) on a dedicated search appliance isolates its huge footprint and technical complexity. This should make marketing and supporting SCS's search technology simpler.

The JSON syntax used within Spice's FL for SHO is to be a proper subset of the JSON syntax that ES uses. This way Spice programmers will be able to use the indexing tool that best scales to their needs. It also makes transitioning among search engines simpler. Having eLibrary be compatible with ES will allow the many free open-source software (FOSS) query and interfacing tools that work with ES to be used with eLibrary and, perhaps, other SCS applications. Elasticsearch is installed on SCS's networked systems. eLibrary now launches on Linux with access to Elasticsearch. The eLibrary conversion is well underway.

The competition was fierce last week, but if you think Trump, Clinton and their fellows were the biggest battle, think again.
For context let me take you back to the early '70s, a time when Linotypes still cast hot metal during newspaper production. I had a relatively recent degree in computer science from the University of Delaware. CS then had three legs: automata theory and formal languages with compiler writing, artificial intelligence (AI) and numerical methods. I brought this with me to the ANPA/RI, the newspaper trade association now called the NAA. The ANPA/RI wanted to paginate newspapers digitally. I designed a laser-based device and wrote software that showed how digital pagination should work. (I still have sample pages printed in 1974.) The ANPA management thought computers were fast, but they had no idea (until I told them) of the computational load entailed in producing pages digitally. The solution was good in theory, but no good in practice. A PDP-11 couldn't put out the bits to dot pages quickly enough. The patent was sold and I was asked to think about what's next.

The answer didn't require a lot of thought. To make pages electronically one needed electronic designs, i.e., digital dummies or ad layout geometries. Without these, manual dummying would necessitate a very inefficient workflow. Thus the Layout series of programs originated. The project fit nicely into my skill set. AI was my favorite academic area. Display ad dummying was another rule-based search problem, just like the chess programs I wrote as a post-graduate.

Back to last week's world-changing competition: Lee Sedol vs. AlphaGo for the world title in Go. This was not just another man vs. computer competition where superior calculating ability wins. Go isn't like chess. Anyone who knows Go will tell you that it is a game of supreme intuition and creativity. In Go the possibilities seem unfathomable and far beyond those of other well-known perfect-information board games. Immense complexity arises from Go's quite simple rules.
Google's AlphaGo from its DeepMind subsidiary beat Go master and world title holder Lee Sedol decisively 4 to 1 in the five-game match. Throughout Asia tens of millions watched the match on TV and the Internet. According to the Korea Baduk Association, AlphaGo’s play reached a level “close to the territory of divinity.”

So what does this have to do with newspaper productivity growth? AlphaGo's programming was unlike others. AlphaGo used neural networks and thousands of master games to learn the subtle patterns of master level play. Soon its best opponent was a prior version of itself. The new versions producing improved results were selected over their predecessors. If you think about it, you should realize that many analytic tasks, even those requiring creativity and intuition, are now within the grasp of machine learning. Will machines diagnose disease better than doctors? Will human financial advisers be bested by algorithms? Will the best composers be software? Which jobs will future robots replace? Mark my words: last week we witnessed a modern John Henry moment.

And, by the way, ever since version 14, Layout-8000™ has gathered data on how its operators use it for both automated and manual dummying. Hiding behind the name LayoutHistoryAdBoss (LHAB) is a pattern database and learning algorithm designed to enable further efficiencies. It encapsulates human dummying expertise. It is being deployed at the design centers of large newspaper groups. With LHAB one gets hints as to where advertisers wouldn't want their ads. What's left is good enough and, through learning, Layout-8000 becomes even better, smarter and more automated. It seems there will soon be a machine intelligence revolution that will rival the industrial revolution.

Before discussing platforms, obsolete or otherwise, an explanation of SCS's business models might help in clarifying our evolving choices.
Our best-of-breed applications are in use at major newspaper groups, numerous metro newspapers and many mid-market newspapers. Our enterprise-wide systems are deployed at mid-market newspapers. For all of them our systems are deployed on on-premise servers or local cloud systems. Because we adapt to their changing requirements, nearly all customers faithfully subscribe to our support. They seem to appreciate the service we provide.

We saw an under-served market among large weeklies and smaller-market dailies, where non-corporate decision makers are both looking for solutions and approachable. There are literally thousands of these. Many have such good franchises that they can afford new technology, especially if it can help them do more with less. Our new business model is to provide both software and hardware under lease and manage everything from our office remotely. We are making excellent progress with this. Currently we monitor the statuses of over three dozen appliance servers at customer sites using Nagios. Statuses are updated every 15 minutes. We often attend to issues before the customer is aware of them. We wish to grow this model since we see providing quality support as our strongest unique selling advantage.

What are large weeklies and smaller-market dailies crying out for?

Functionality and adaptability - How can they get the features that big newspapers use and can afford but on a much smaller budget?
Maintainability - Doing more with less is hard if it requires an expensive IT staff. Maintenance is best outsourced to specialists like us.
Reliability - No moving parts. No tape drives, disk drives or other mechanical storage devices. Good platforms have sufficient redundancy so that they have no single points of failure.
Ample capacity - Enough processing power and storage to meet not just everyday requirements but peak demands.

We found what we think is a near-perfect solution for today's newspaper server platforms.
I'm not sure Intel appreciates how good their NUC (Next Unit of Computing) devices are. NUCs are sold as easy-to-configure kits. I think Intel thinks their primary use might be for high-end gaming, but you couldn't ask for a nicer appliance for a network of small servers. For around $1,300 we get enough computing power to run our applications at small and medium-sized newspapers. How much power? Multi-core 3GHz processors, 16-32 GB of RAM, SSD storage to 3TB, versatile connectivity, etc., all in a 4"x4"x2" case. And it gets better with Linux compatibility, very low power requirements, a 3-year warranty and operation without expensive environmental devices. They're closet compatible. Each can support up to 75 users and they are expandable group-wide through easy networking. We provide NUCs in pairs as local cloud appliances with optional remote near-realtime backup systems.

NUCs enabled us to transform from an ISP (independent software provider) and VAR (value added reseller) into an MSP (managed services provider). Most importantly we were able to do this without experiencing significant negative cash flow for initial equipment purchases. Picking appropriate technology enables business transformations.
Richard J. Cichelli
SCS President & CEO