Census

A census is the procedure of systematically acquiring and recording information about the members of a given population. This term is used mostly in connection with national population and housing censuses; other common censuses include traditional culture, business, supplies, and traffic censuses. The United Nations defines the essential features of population and housing censuses as "individual enumeration, universality within a defined territory, simultaneity and defined periodicity", and recommends that population censuses be taken at least every 10 years. United Nations recommendations also cover census topics to be collected, official definitions, classifications and other useful information to co-ordinate international practices.

The word is of Latin origin: during the Roman Republic, the census was a list that kept track of all adult males fit for military service. The modern census is essential to international comparisons of any kind of statistics, and censuses collect data on many attributes of a population, not just how many people there are. Censuses typically began as the only method of collecting national demographic data, and are now part of a larger system of different surveys. Although population estimates, including the exact geographic distribution of the population, remain an important function of a census, statistics can also be produced about combinations of attributes, e.g. education by age and sex in different regions. Current administrative data systems allow for other approaches to enumeration with the same level of detail but raise concerns about privacy and the possibility of biasing estimates.

A census can be contrasted with sampling in which information is obtained only from a subset of a population; typically main population estimates are updated by such intercensal estimates. Modern census data are commonly used for research, business marketing, and planning, and as a baseline for designing sample surveys by providing a sampling frame such as an address register. Census counts are necessary to adjust samples to be representative of a population by weighting them as is common in opinion polling. Similarly, stratification requires knowledge of the relative sizes of different population strata which can be derived from census enumerations. In some countries, the census provides the official counts used to apportion the number of elected representatives to regions (sometimes controversially – e.g., Utah v. Evans). In many cases, a carefully chosen random sample can provide more accurate information than attempts to get a population census.
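
The weighting step mentioned above can be illustrated with a minimal sketch of post-stratification, assuming a sample whose strata are over- or under-represented relative to census counts. The function names, stratum labels and figures are hypothetical and chosen only for illustration.

```python
def poststratification_weights(census_counts, sample_counts):
    """Weight each stratum by (census share) / (sample share)."""
    census_total = sum(census_counts.values())
    sample_total = sum(sample_counts.values())
    weights = {}
    for stratum, sampled in sample_counts.items():
        if sampled == 0:
            raise ValueError(f"no respondents in stratum {stratum!r}")
        census_share = census_counts[stratum] / census_total
        sample_share = sampled / sample_total
        weights[stratum] = census_share / sample_share
    return weights


def weighted_mean(responses, weights):
    """Weighted mean of a list of (stratum, value) pairs."""
    total_weight = sum(weights[stratum] for stratum, _ in responses)
    weighted_sum = sum(weights[stratum] * value for stratum, value in responses)
    return weighted_sum / total_weight


# Hypothetical figures: the census says 30% of people are under 40,
# but the sample contains equal numbers from each group.
census_counts = {"under_40": 300, "40_plus": 700}
sample_counts = {"under_40": 50, "40_plus": 50}
weights = poststratification_weights(census_counts, sample_counts)

# Every under-40 respondent answers 1 and every older respondent answers 0.
responses = [("under_40", 1)] * 50 + [("40_plus", 0)] * 50
estimate = weighted_mean(responses, weights)
```

In this example the unweighted sample mean would be 0.5, while the weighted estimate moves towards the census proportion of the under-40 group.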

Sampling
A census is often construed as the opposite of a sample, as its intent is to count everyone in a population rather than a fraction. However, population censuses rely on a sampling frame to count the population. This is the only way to be sure that everyone has been included, as otherwise those not responding would not be followed up on and individuals could be missed. The fundamental premise of a census is that the population is not known and a new estimate is to be made by the analysis of primary data. The use of a sampling frame is counterintuitive as it suggests that the population size is already known. However, a census is also used to collect attribute data on the individuals in the nation. This process of sampling marks the difference between the historical census, which was a house-to-house process or the product of an imperial decree, and the modern statistical project.

The sampling frame used by a census is almost always an address register. Thus it is not known in advance whether anyone is resident at an address or how many people there are in each household. Depending on the mode of enumeration, a form is sent to the householder, an enumerator calls, or administrative records for the dwelling are accessed. As a preliminary to the dispatch of forms, census workers will check any address problems on the ground. While it may seem straightforward to use the postal service file for this purpose, this can be out of date and some dwellings may contain a number of independent households.

A particular problem is what are termed 'communal establishments', a category that includes student residences, religious orders, homes for the elderly and prisons. As these are not easily enumerated by a single householder, they are often treated differently and visited by special teams of census workers to ensure they are classified appropriately.

Residence definitions
Individuals are normally counted within households, and information is typically collected about the household structure and the housing. For this reason international documents refer to censuses of population and housing. Normally the census response is made by a household, indicating details of individuals resident there. An important aspect of census enumerations is determining which individuals can be counted and which cannot be counted. Broadly, three definitions can be used: de facto residence; de jure residence; and permanent residence. This is important in considering individuals who have multiple or temporary addresses. Every person should be identified uniquely as resident in one place; but the place where they happen to be on Census Day, their de facto residence, may not be the best place to count them. Where an individual uses services may be more useful, and this is at their usual residence. An individual may be recorded at a "permanent" address, which might be a family home for students or long term migrants.

A precise definition of residence is needed to decide whether visitors to a country should be included in the population count. This is becoming more important as students travel abroad for education for a period of several years. Other groups causing problems of enumeration are newborn babies, refugees, people away on holiday, people moving home around census day, and people without a fixed address.

People with second homes because they are working in another part of the country or have a holiday cottage are difficult to fix at a particular address; this sometimes causes double counting or houses being mistakenly identified as vacant. Another problem is where people use a different address at different times e.g. students living at their place of education in term time but returning to a family home during vacations, or children whose parents have separated who effectively have two family homes. Census enumeration has always been based on finding people where they live, as there is no systematic alternative: any list you could use to find people is likely to be derived from census activities in the first place. Recent UN guidelines provide recommendations on enumerating such complex households.

Enumeration strategies
Historical censuses used crude enumeration assuming absolute accuracy. Modern approaches take into account the problems of overcount and undercount, and the coherence of census enumerations with other official sources of data. This reflects a realist approach to measurement, acknowledging that under any definition of residence there is a true value of the population but this can never be measured with complete accuracy. An important aspect of the census process is to evaluate the quality of the data.

Many countries use a post-enumeration survey to adjust the raw census counts. This works in a similar manner to capture-recapture estimation for animal populations. Among census experts this method is called dual system enumeration (DSE). A sample of households is visited by interviewers who record the details of the household as at census day. These data are then matched to census records, and the number of people missed can be estimated by considering the numbers of people who are included in one count but not the other. This allows adjustments to the count for non-response, varying between different demographic groups. An explanation using a fishing analogy can be found in "Trout, Catfish and Roach...", which won an award from the Royal Statistical Society for excellence in official statistics in 2011.

Triple system enumeration has been proposed as an improvement, as it would allow evaluation of the statistical dependence of pairs of sources. However, as the matching process is the most difficult aspect of census estimation, this has never been implemented for a national enumeration. It would also be difficult to identify three different sources that were sufficiently different to make the triple system effort worthwhile.

The DSE approach has another weakness in that it assumes no person is counted twice (overcount). Under de facto residence definitions this would not be a problem, but under de jure definitions individuals risk being recorded on more than one form, leading to double counting. A particular problem here is students, who often have both a term-time and a family address.
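
The capture-recapture logic behind dual system enumeration can be sketched with the classical Lincoln-Petersen estimator. This is a simplified illustration, not the procedure of any particular statistical office: it assumes the census and the survey are independent, that matching is error-free and that nobody is counted twice. The function name and the figures are hypothetical.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Lincoln-Petersen estimate of the true population of an area.

    census_count:  people recorded by the census in the surveyed areas
    survey_count:  people recorded by the post-enumeration survey
    matched_count: people found in both sources after record matching
    """
    if matched_count <= 0:
        raise ValueError("at least one matched record is required")
    if matched_count > min(census_count, survey_count):
        raise ValueError("matches cannot exceed either source count")
    return census_count * survey_count / matched_count


# Hypothetical figures for the areas covered by the survey.
census_count = 900
survey_count = 500
matched_count = 450

estimated_population = dual_system_estimate(census_count, survey_count, matched_count)
estimated_missed_by_census = estimated_population - census_count
census_coverage_rate = matched_count / survey_count
```

Here the survey finds that 450 of its 500 people were also in the census, a coverage rate of 90%, so the 900 people counted by the census are scaled up to an estimated 1,000. In practice the calculation would be carried out separately for different demographic groups, since non-response varies between them.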

Several countries have used a system which is known as short form/long form. This is a sampling strategy which randomly chooses a proportion of people to send a more detailed questionnaire to (the long form). Everyone receives the short form questions. This means more data are collected, but without imposing a burden on the whole population. This also reduces the burden on the statistical office. Indeed, in the UK until 2001 all residents were required to fill in the whole form but only a 10% sample were coded and analysed in detail. New technology means that all data are now scanned and processed. In 2010 there was controversy in Canada about the cessation of the mandatory long form; the head of Statistics Canada, Munir Sheikh, resigned.
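
The short form/long form strategy amounts to drawing a simple random sample of households for the detailed questionnaire. The following is a minimal sketch under that assumption; the function name, the 20% fraction and the household identifiers are hypothetical, and real censuses may use more elaborate sampling designs.

```python
import random


def assign_forms(household_ids, long_form_fraction, seed=None):
    """Randomly pick a fraction of households to receive the long form.

    Households labelled "long" answer the short-form questions plus the
    detailed ones; all others answer the short-form questions only.
    """
    if not 0 <= long_form_fraction <= 1:
        raise ValueError("long_form_fraction must be between 0 and 1")
    ids = list(household_ids)
    rng = random.Random(seed)  # seeded so the draw can be reproduced
    n_long = round(len(ids) * long_form_fraction)
    long_form = set(rng.sample(ids, n_long))
    return {hid: ("long" if hid in long_form else "short") for hid in ids}


# Hypothetical address register of 1,000 households, 20% long-form sample.
assignment = assign_forms(range(1000), long_form_fraction=0.2, seed=1)
n_long_forms = sum(1 for form in assignment.values() if form == "long")
```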

The use of alternative enumeration strategies is increasing but these are not as simple as many people assume, and are only used in developed countries. The Netherlands has been most advanced in adopting a census using administrative data. This allows a simulated census to be conducted by linking several different administrative databases at an agreed time. Data can be matched and an overall enumeration established allowing for discrepancies between different data sources. A validation survey is still conducted in a similar way to the post enumeration survey employed in a traditional census.

Other countries which have a population register use this as a basis for all the census statistics needed by users. This is most common among Nordic countries, but requires a large number of different registers to be combined, including population, housing, employment and education. These registers are then combined and brought up to the standard of a statistical register by comparing the data in different sources and ensuring the quality is sufficient for official statistics to be produced. A recent innovation is the French instigation of a rolling census programme with different regions enumerated each year, so that the whole country is completely enumerated every 5 to 10 years. In Europe, in connection with the 2010 census round, a large number of countries adopted alternative census methodologies, often based on the combination of data from registers, surveys and other sources.

Technology
Censuses have evolved in their use of technology: censuses in 2010 used many new types of computing. In Brazil, handheld devices were used by enumerators to locate residences on the ground. In many countries, census returns could be made via the Internet as well as in paper form. DSE is facilitated by computer matching techniques which can be automated, such as propensity score matching. In the UK, all census forms are scanned and stored electronically before being destroyed, removing the need for physical archives. The record linking to perform an administrative census would not be possible without large databases being stored on computer systems.

There are sometimes problems in introducing new technology. The US census had been intended to use handheld computers, but costs escalated and this was abandoned, with the contract being sold to Brazil. Online response has some advantages, but one of the functions of the census is to make sure everyone is counted accurately. A system which allowed people to enter their address without verification would be open to abuse. Therefore, households have to be verified on the ground, typically by an enumerator visit or post out. Paper forms are still necessary for those without access to the internet. It is also possible that the hidden nature of an administrative census means that users are not engaged with the importance of contributing their data to official statistics.

Alternatively, population estimations may be carried out remotely with GIS and remote sensing technologies.

Census and development
According to UNFPA, "The information generated by a population and housing census – numbers of people, their distribution, their living conditions and other key data – is critical for development." This is because this type of data is essential for policymakers so that they know where to invest. Unfortunately, many countries have outdated or inaccurate data about their populations and thus have difficulty in addressing the needs of the population.

UNFPA said:

"The unique advantage of the census is that it represents the entire statistical universe, down to the smallest geographical units, of a country or region. Planners need this information for all kinds of development work, including: assessing demographic trends; analysing socio-economic conditions; designing evidence-based poverty-reduction strategies; monitoring and evaluating the effectiveness of policies; and tracking progress toward national and internationally agreed development goals."

In addition to making policymakers aware of population issues, the census is also an important tool for identifying forms of social, demographic or economic exclusion, such as inequalities relating to race, ethnicity and religion, as well as disadvantaged groups such as those with disabilities and the poor.

An accurate census can empower local communities by providing them with the necessary information to participate in local decision-making and ensuring they are represented.

Uses of census data
Early censuses in the 19th century collected paper documents which had to be collated by hand, so the statistical information obtained was quite basic. The government that owned the data could publish statistics on the state of the nation. The results were used to measure changes in the population and apportion representation. Population estimates could be compared to those of other countries.

By the beginning of the 20th century, censuses were recording households and some indications of their employment. In some countries, census archives are released for public examination after many decades, allowing genealogists to track the ancestry of interested people. Archives provide a substantial historical record which may challenge established views. Information such as job titles and arrangements for the destitute and sick may also shed light on the historical structure of society.

Political considerations influence the census in many countries. In Canada in 2010 for example, the government under the leadership of Stephen Harper abolished the mandatory long-form census. This abolition was a response to protests from some Canadians who resented the personal questions. The long-form census was reinstated by the Justin Trudeau government in 2016.

Census data and research
As governments assumed responsibility for schooling and welfare, large government research departments made extensive use of census data. Population projections could be made, to help plan for provision in local government and regions. Central government could also use census data to allocate funding. Even in the mid 20th century, census data was only directly accessible to large government departments. However, computers meant that tabulations could be used directly by university researchers, large businesses and local government offices. They could use the detail of the data to answer new questions and add to local and specialist knowledge.

Nowadays, census data are published in a wide variety of formats to be accessible to business, all levels of government, media, students and teachers, charities and any citizen who is interested; researchers in particular have an interest in the role of Census Field Officers (CFO) and their assistants. Data can be represented visually or analysed in complex statistical models, to show the difference between certain areas, or to understand the association between different personal characteristics. Census data offer a unique insight into small areas and small demographic groups which sample data would be unable to capture with precision.

Privacy
Although the census provides useful statistical information about a population, the availability of this information can sometimes lead to abuses, political or otherwise, by the linking of individuals' identities to anonymous census data. This is particularly important when individuals' census responses are made available in microdata form, but even aggregate-level data can result in privacy breaches when dealing with small areas and/or rare subpopulations.

For instance, when reporting data from a large city, it might be appropriate to give the average income for black males aged between 50 and 60. However, doing this for a town that only has two black males in this age group would be a breach of privacy because either of those persons, knowing his own income and the reported average, could determine the other man's income.

Typically, census data are processed to obscure such individual information, a practice known as statistical disclosure control. Some agencies do this by intentionally introducing small statistical errors to prevent the identification of individuals in marginal populations; others swap variables for similar respondents. Whatever is done to reduce the privacy risk, new and improved electronic analysis of data can threaten to reveal sensitive individual information.
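
Both the two-person disclosure problem and the idea of introducing small errors can be made concrete with a short sketch. The perturbation rule below (uniform integer noise of at most 2, clipped at zero) is an illustrative assumption and not the method of any particular agency; the function names and figures are likewise hypothetical.

```python
import random


def infer_other_value(reported_mean, own_value):
    """In a group of exactly two people, the published mean and one
    person's own value are enough to recover the other person's value."""
    return 2 * reported_mean - own_value


def perturb_counts(counts, max_noise=2, seed=None):
    """Add small random integer noise to each published cell count.

    Noise is drawn uniformly from -max_noise..max_noise and results are
    clipped at zero so that no negative counts are published.
    """
    rng = random.Random(seed)
    noisy = {}
    for cell, count in counts.items():
        noise = rng.randint(-max_noise, max_noise)
        noisy[cell] = max(0, count + noise)
    return noisy


# Disclosure: a published mean income of 50,000 for a group of two lets a
# member earning 40,000 work out that the other member earns 60,000.
other_income = infer_other_value(reported_mean=50_000, own_value=40_000)

# Protection: hypothetical small-area counts with noise added before release.
true_counts = {"area_A": 2, "area_B": 17, "area_C": 140}
published_counts = perturb_counts(true_counts, max_noise=2, seed=7)
```

The noise matters most for small cells such as `area_A`, where an exact count would be most revealing, while it barely changes the larger counts.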

Another possibility is to present survey results by means of statistical models in the form of a multivariate distribution mixture. The statistical information in the form of conditional distributions (histograms) can be derived interactively from the estimated mixture model without any further access to the original database. As the final product does not contain any protected microdata, the model-based interactive software can be distributed without any confidentiality concerns.

Another method is simply to release no data at all, except very large scale data directly to the central government. Different release strategies among governments have led to an international project (IPUMS) to co-ordinate access to microdata and corresponding metadata. Projects such as SDMX also promote standardising metadata, so that best use can be made of the minimal data available.

Egypt
Censuses in Egypt first appeared in the late Middle Kingdom and developed in the New Kingdom. Pharaoh Amasis, according to Herodotus, required every Egyptian to declare annually to the nomarch, "whence he gained his living". Under the Ptolemies and the Romans several censuses were conducted in Egypt by government officials.

Ancient Greece
There are several accounts of ancient Greek city states carrying out censuses.

Israel
Censuses are mentioned in the Bible. God commands a per capita tax to be paid with the census for the upkeep of the Tabernacle. The Book of Numbers is named after the counting of the Israelite population according to the house of the Fathers after the exodus from Egypt. A second census was taken while the Israelites were camped in the plains of Moab.

King David performed a census that produced disastrous results. His son, King Solomon, had all of the foreigners in Israel counted.

When the Romans took over Judea in AD 6, the legate Publius Sulpicius Quirinius organised a census for tax purposes. The Gospel of Luke links the birth of Jesus to this event.

China
One of the world's earliest preserved censuses was held in China in AD 2 during the Han Dynasty, and is still considered by scholars to be quite accurate. The population was registered as having 57,671,400 individuals in 12,366,470 households. Another census was held in AD 144.

India
The oldest recorded census in India is thought to have occurred around 330 BC during the reign of Emperor Chandragupta Maurya under the leadership of Kautilya or Chanakya and Ashoka.

Rome
The word census originated in ancient Rome from the Latin word censere ("to estimate"). The census played a crucial role in the administration of the Roman Empire, as it was used to determine taxes. With few interruptions, it was usually carried out every five years. It provided a register of citizens and their property from which their duties and privileges could be listed. It is said to have been instituted by the Roman king Servius Tullius in the 6th century BC, at which time the number of arms-bearing citizens was supposedly counted at around 80,000. The AD 6 "census of Quirinius", undertaken following the imposition of direct Roman rule in Judea, was partially responsible for the development of the Zealot movement and several failed rebellions against Rome that ended in the Diaspora. The 15-year indiction cycle established by Diocletian in AD 297 was based on quindecennial censuses and formed the basis for dating in late antiquity and under the Byzantine Empire.

Rashidun and Umayyad Caliphates
In the Middle Ages, the Caliphate began conducting regular censuses soon after its formation, beginning with the one ordered by the second Rashidun caliph, Umar.

Medieval Europe
The Domesday Book was undertaken in 1086 by William I of England so that he could properly tax the land he had recently conquered. In 1183, a census was taken of the crusader Kingdom of Jerusalem, to ascertain the number of men and amount of money that could possibly be raised against an invasion by Saladin, sultan of Egypt and Syria.

In 1328, the first national census of France (L'État des paroisses et des feux) was taken, mostly for fiscal purposes. It estimated the French population at 16 to 17 million.

Inca Empire
In the 15th century, the Inca Empire had a unique way to record census information. The Incas did not have any written language but recorded information collected during censuses and other numeric information as well as non-numeric data on quipus, strings from llama or alpaca hair or cotton cords with numeric and other values encoded by knots in a base-10 positional system.

Spanish Empire
On May 25, 1577, King Philip II of Spain ordered by royal cédula the preparation of a general description of Spain's holdings in the Indies. Instructions and a questionnaire, issued in 1577 by the Office of the Cronista Mayor, were distributed to local officials in the Viceroyalties of New Spain and Peru to direct the gathering of information. The questionnaire, composed of fifty items, was designed to elicit basic information about the nature of the land and the life of its peoples. The replies, known as "relaciones geográficas", were written between 1579 and 1585 and were returned to the Cronista Mayor in Spain by the Council of the Indies.

World population estimates
The earliest estimate of the world population was made by Giovanni Battista Riccioli in 1661; the next by Johann Peter Süssmilch in 1741, revised in 1762; the third by Karl Friedrich Wilhelm Dieterici in 1859.

In 1931, Walter Willcox published a table in his book, International Migrations: Volume II Interpretations, that estimated the 1929 world population to be roughly 1.8 billion.