Tag Archives: open source

A World Digital Library Is Coming True! by Robert Darnton | The New York Review of Books

In the scramble to gain market share in cyberspace, something is getting lost: the public interest. Libraries and laboratories—crucial nodes of the World Wide Web—are buckling under economic pressure, and the information they diffuse is being diverted away from the public sphere, where it can do most good.

Not that information comes free or “wants to be free,” as Internet enthusiasts proclaimed twenty years ago. It comes filtered through expensive technologies and financed by powerful corporations. No one can ignore the economic realities that underlie the new information age, but who would argue that we have reached the right balance between commercialization and democratization?

Consider the cost of scientific periodicals, most of which are published exclusively online. It has increased at four times the rate of inflation since 1986. The average price of a year’s subscription to a chemistry journal is now $4,044. In 1970 it was $33. A subscription to the Journal of Comparative Neurology cost $30,860 in 2012—the equivalent of six hundred monographs. Three giant publishers—Reed Elsevier, Wiley-Blackwell, and Springer—publish 42 percent of all academic articles, and they make giant profits from them. In 2013 Elsevier turned a 39 percent profit on an income of £2.1 billion from its science, technical, and medical journals.

All over the country research libraries are canceling subscriptions to academic journals, because they are caught between decreasing budgets and increasing costs. The logic of the bottom line is inescapable, but there is a higher logic that deserves consideration—namely, that the public should have access to knowledge produced with public funds.

[…]

The struggle over academic journals should not be dismissed as an “academic question,” because a great deal is at stake. Access to research drives large sectors of the economy—the freer and quicker the access, the more powerful its effect. The Human Genome Project cost $3.8 billion in federal funds to develop, and thanks to the free accessibility of the results, it has already produced $796 billion in commercial applications. Linux, the free, open-source software system, has brought in billions in revenue for many companies, including Google.

[…]

According to a study completed in 2006 by John Houghton, a specialist in the economics of information, a 5 percent increase in the accessibility of research would have produced an increase in productivity worth $16 billion.

[…]

Yet accessibility may decrease, because the price of journals has escalated so disastrously that libraries—and also hospitals, small-scale laboratories, and data-driven enterprises—are canceling subscriptions. Publishers respond by charging still more to institutions with budgets strong enough to carry the additional weight.

[…]

In the long run, journals can be sustained only through a transformation of the economic basis of academic publishing. The current system developed as a component of the professionalization of academic disciplines in the nineteenth century. It served the public interest well through most of the twentieth century, but it has become dysfunctional in the age of the Internet.

[…]

The entire system of communicating research could be made less expensive and more beneficial for the public by a process known as “flipping.” Instead of subsisting on subscriptions, a flipped journal covers its costs by charging processing fees before publication and making its articles freely available, as “open access,” afterward. That will sound strange to many academic authors. Why, they may ask, should we pay to get published? But they may not understand the dysfunctions of the present system, in which they furnish the research, writing, and refereeing free of charge to the subscription journals and then buy back the product of their work—not personally, of course, but through their libraries—at an exorbitant price. The public pays twice—first as taxpayers who subsidize the research, then as taxpayers or tuition payers who support public or private university libraries.

By creating open-access journals, a flipped system directly benefits the public. Anyone can consult the research free of charge online, and libraries are liberated from the spiraling costs of subscriptions. Of course, the publication expenses do not evaporate miraculously, but they are greatly reduced, especially for nonprofit journals, which do not need to satisfy shareholders. The processing fees, which can run to a thousand dollars or more, depending on the complexities of the text and the process of peer review, can be covered in various ways. They are often included in research grants to scientists, and they are increasingly financed by the author’s university or a group of universities.

[…]

The main impediment to public-spirited publishing of this kind is not financial. It involves prestige. Scientists prefer to publish in expensive journals like Nature, Science, and Cell, because the aura attached to them glows on CVs and promotes careers. But some prominent scientists have undercut the prestige effect by founding open-access journals and recruiting the best talent to write and referee for them. Harold Varmus, a Nobel laureate in physiology and medicine, has made a huge success of Public Library of Science, and Paul Crutzen, a Nobel laureate in chemistry, has done the same with Atmospheric Chemistry and Physics. They have proven the feasibility of high-quality, open-access journals. Not only do they cover costs through processing fees, but they produce a profit—or rather, a “surplus,” which they invest in further open-access projects.

[…]

DASH now includes 17,000 articles, and it has registered three million downloads from countries in every continent. Repositories in other universities also report very high scores in their counts of downloads. They make knowledge available to a broad public, including researchers who have no connection to an academic institution; and at the same time, they make it possible for writers to reach far more readers than would be possible by means of subscription journals.

The desire to reach readers may be one of the most underestimated forces in the world of knowledge. Aside from journal articles, academics produce a large number of books, yet they rarely make much money from them. Authors in general derive little income from a book a year or two after its publication. Once its commercial life has ended, it dies a slow death, lying unread, except for rare occasions, on the shelves of libraries, inaccessible to the vast majority of readers. At that stage, authors generally have one dominant desire: for their work to circulate freely through the public; and their interest coincides with the goals of the open-access movement.

[…]

All sorts of complexities remain to be worked out before such a plan can succeed: How to accommodate the interests of publishers, who want to keep books on their backlists? Where to leave room for rights holders to opt out and for the revival of books that take on new economic life? Whether to devise some form of royalties, as in the extended collective licensing programs that have proven to be successful in the Scandinavian countries? It should be possible to enlist vested interests in a solution that will serve the public interest, not by appealing to altruism but rather by rethinking business plans in ways that will make the most of modern technology.

Several experimental enterprises illustrate possibilities of this kind. Knowledge Unlatched gathers commitments and collects funds from libraries that agree to purchase scholarly books at rates that will guarantee payment of a fixed amount to the publishers who are taking part in the program. The more libraries participating in the pool, the lower the price each will have to pay. While electronic editions of the books will be available everywhere free of charge through Knowledge Unlatched, the subscribing libraries will have the exclusive right to download and print out copies.

[…]

OpenEdition Books, located in Marseille, operates on a somewhat similar principle. It provides a platform for publishers who want to develop open-access online collections, and it sells the e-content to subscribers in formats that can be downloaded and printed. Operating from Cambridge, England, Open Book Publishers also charges for PDFs, which can be used with print-on-demand technology to produce physical books, and it applies the income to subsidies for free copies online. It recruits academic authors who are willing to provide manuscripts without payment in order to reach the largest possible audience and to further the cause of open access.

The famous quip of Samuel Johnson, “No man but a blockhead ever wrote, except for money,” no longer has the force of a self-evident truth in the age of the Internet. By tapping the goodwill of unpaid authors, Open Book Publishers has produced forty-one books in the humanities and social sciences, all rigorously peer-reviewed, since its foundation in 2008. “We envisage a world in which all research is freely available to all readers,” it proclaims on its website.

[…]

Google set out to digitize millions of books in research libraries and then proposed to sell subscriptions to the resulting database. Having provided the books to Google free of charge, the libraries would then have to buy back access to them, in digital form, at a price to be determined by Google and that could escalate as disastrously as the prices of scholarly journals.

Google Book Search actually began as a search service, which made available only snippets or short passages of books. But because many of the books were covered by copyright, Google was sued by the rights holders; and after lengthy negotiations the plaintiffs and Google agreed on a settlement, which transformed the search service into a gigantic commercial library financed by subscriptions. But the settlement had to be approved by a court, and on March 22, 2011, the Southern Federal District Court of New York rejected it on the grounds that, among other things, it threatened to constitute a monopoly in restraint of trade. That decision put an end to Google’s project and cleared the way for the DPLA to offer digitized holdings—but nothing covered by copyright—to readers everywhere, free of charge.

Aside from its not-for-profit character, the DPLA differs from Google Book Search in a crucial respect: it is not a vertical organization erected on a database of its own. It is a distributed, horizontal system, which links digital collections already in the possession of the participating institutions, and it does so by means of a technological infrastructure that makes them instantly available to the user with one click on an electronic device. It is fundamentally horizontal, both in organization and in spirit.

Instead of working from the top down, the DPLA relies on “service hubs,” or small administrative centers, to promote local collections and aggregate them at the state level. “Content hubs” located in institutions with collections of at least 250,000 items—for example, the New York Public Library, the Smithsonian Institution, and the collective digital repository known as HathiTrust—provide the bulk of the DPLA’s holdings. There are now two dozen service and content hubs, and soon, if financing can be found, they will exist in every state of the union.

Such horizontality reinforces the democratizing impulse behind the DPLA. Although it is a small, nonprofit corporation with headquarters and a minimal staff in Boston, the DPLA functions as a network that covers the entire country. It relies heavily on volunteers. More than a thousand computer scientists collaborated free of charge in the design of its infrastructure, which aggregates metadata (catalog-type descriptions of documents) in a way that allows easy searching.

Therefore, for example, a ninth-grader in Dallas who is preparing a report on an episode of the American Revolution can download a manuscript from New York, a pamphlet from Chicago, and a map from San Francisco in order to study them side by side. Unfortunately, he or she will not be able to consult any recent books, because copyright laws keep virtually everything published after 1923 out of the public domain. But the courts, which are considering a flurry of cases about the “fair use” of copyright, may sustain a broad-enough interpretation for the DPLA to make a great deal of post-1923 material available for educational purposes.

A small army of volunteer “Community Reps,” mainly librarians with technical skills, is fanning out across the country to promote various outreach programs sponsored by the DPLA. They reinforce the work of the service hubs, which concentrate on public libraries as centers of collection-building. A grant from the Bill and Melinda Gates Foundation is financing a Public Library Partnerships Project to train local librarians in the latest digital technologies. Equipped with new skills, the librarians will invite people to bring in material of their own—family letters, high school yearbooks, postcard collections stored in trunks and attics—to be digitized, curated, preserved, and made accessible online by the DPLA. While developing local community consciousness about culture and history, this project will also help integrate local collections in the national network.

[…]

In these and other ways, the DPLA will go beyond its basic mission of making the cultural heritage of America available to all Americans. It will provide opportunities for them to interact with the material and to develop materials of their own. It will empower librarians and reinforce public libraries everywhere, not only in the United States. Its technological infrastructure has been designed to be interoperable with that of Europeana, a similar enterprise that is aggregating the holdings of libraries in the twenty-eight member states of the European Union. The DPLA’s collections include works in more than four hundred languages, and nearly 30 percent of its users come from outside the US. Ten years from now, the DPLA’s first year of activity may look like the beginning of an international library system.

It would be naive, however, to imagine a future free from the vested interests that have blocked the flow of information in the past. The lobbies at work in Washington also operate in Brussels, and a newly elected European Parliament will soon have to deal with the same issues that remain to be resolved in the US Congress. Commercialization and democratization operate on a global scale, and a great deal of access must be opened before the World Wide Web can accommodate a worldwide library.

Adobe and Google Debut Typeface Family of Asian Languages

Original sketch by type designer Ryoko Nishizuka.

The Adobe font, named Source Han Sans, is a new open source offering for the company’s Pan-CJK typeface family.

Google is simultaneously releasing its own version of this font under the name Noto Sans CJK as part of a plan to build out its Noto Pan-Unicode font family. Both sets, developed in collaboration, are identical except for the name and will serve 1.5 billion people — roughly a quarter of the world’s population.

The new typeface family is available in seven weights, supporting Japanese, Korean, Traditional Chinese and Simplified Chinese, all in one font.

[…]

“The design is relatively modern in style, but it has simple strokes and is monolinear so it makes text clear and readable on small devices such as tablets and smartphones,” said Nicole Minoza, Adobe’s product marketing manager.

“Because it’s a sans serif typeface, it’s a workhorse font — good for a single line of text or a short phrase or something you might see in a software menu, as well as longer strings of text that would appear in an ebook or a printed publication.”

[…]

Each font weight in the family has a total of 65,535 glyphs (the maximum number of characters supported in the OpenType format), and the entire family contains just under half a million total glyphs.
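The two figures above can be checked with a little arithmetic. The sketch below assumes that the per-font ceiling comes from OpenType's 16-bit glyph identifiers and that all seven weights mentioned earlier are filled to that ceiling; the names used are illustrative only.

```python
# Rough check of the glyph counts quoted in the article.
# Assumption: the per-font limit follows from 16-bit glyph IDs in OpenType.
GLYPH_ID_BITS = 16
MAX_GLYPHS_PER_FONT = 2**GLYPH_ID_BITS - 1  # 65,535, the figure quoted above
WEIGHTS = 7  # the family is described as available in seven weights


def family_glyph_total(weights: int = WEIGHTS,
                       glyphs_per_font: int = MAX_GLYPHS_PER_FONT) -> int:
    """Total glyphs across the family if every weight is filled to the limit."""
    return weights * glyphs_per_font


if __name__ == "__main__":
    print(f"{family_glyph_total():,}")
```

Under those assumptions the total is 458,745, which agrees with the article's "just under half a million."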

[…]

“Not only are the open source fonts free, but users can extend and modify them,” Minoza said. “They would have the right to add Vietnamese characters, for example. Hardware and software manufacturers can install the fonts on their devices. There’s a really big audience and the licensing rights for open source makes it good for device manufacturers.”

[…]

Discussions around creating a Pan-CJK font started about 15 years ago at Adobe, but the company couldn’t get beyond the overall cost in terms of time and resources.

With this joint project, Adobe contributed its design, technical skill, in-country type experience, coordination and automation, while Google took charge of project direction, requirements definition, in-country testing resources and expertise, and funding.

[…]

To make sure the font was authentic for native readers, Adobe sought expertise from three partner foundries: Iwata Corp., to expand the Japanese glyph selection; Sandoll Communication, to design the Korean Hangul (the native Korean alphabet); and Changzhou SinoType, Adobe’s longtime collaborator in China.

Each foundry was assigned a different task for a unique contribution to the project. Said Minoza, “Iwata fleshed out the original Japanese design, which was provided to our other partners. Sandoll created the Hangul characters from scratch — and they needed to make sure they harmonized with the other characters as well as with the Latin characters — and SinoType not only had to expand the Chinese glyph sets but they had to analyze each of the glyphs to make sure they satisfied regional considerations.

“There are a lot of instances and regional variations for the characters even though they all evolved from the same character originally.” The new font also features Hong Kong and Taiwanese character sets.

Ryoko Nishizuka, an Adobe senior designer on the Tokyo type team, created the overall type design from which the other language variations are derived.
