Module src/hash/hash-generate-cmph.c

Convert a text file with (key, value) pairs into a compilable C file. When compiled, it exports one symbol, which is hash_info_{prefix}. It uses the CMPH library, available at http://cmph.sourceforge.net/, by Davi de Castro Reis and Fabiano Cupertino Botelho. Currently the following algorithms are supported: FCH, BDZ.

Read the generated hash data, read the list of keys and the associated values, and write the hash table. Each bucket is 32 bits long and contains:

  bits         contents
  hash_bits    upper bits of the hash value
  offset_bits  offset of the data in the data string
  length_bits  length of the data

When "combine" is activated, the offsets can be in arbitrary order, and the length has to be specified. When it is off, the data offsets are in ascending order, and no length is needed; the end of the data is equal to the start of the next bucket's data.

Note that neither the BDZ nor the FCH hash algorithm is order preserving; this means that each key maps to a distinct bucket number, but in an undefined order. The following steps are required:

  • read all key/data pairs
  • calculate the bucket number for each key to determine the order
  • write the data in this order sequentially into a string (no combining), or check for duplicates and write the data in any order
  • write an index table with part of the hash value, one offset per bucket and optionally a length

Copyright (C) 2007, 2008 Wolfgang Oertl

This program is free software and can be used under the terms of the GNU Lesser General Public License version 2.1. You can find the full text of this license here: http://opensource.org/licenses/lgpl-license.php.

Functions

local _compute_sizes () Determine how to allocate the bits of the 32 bits in the buckets.
local _output_buckets (L, ofile) Output the buckets table.
build_hash_table (L, ifile, ofile) Given the already generated hash function, read the list of keys and the associated value, and write the hash table.
local convert_funcnr (nr) Given a CMPH hash function number, convert that to the numbers used within LuaGnome.
local dump_bdz (f) Output the additional data fields specific for the BDZ algorithm.
local dump_fch (ofile) Output the data structure for fch.
local dump_mphf (L, f) Call the appropriate dump function to create C code containing the data of the cmph hash function.
generate_hash_cmph (L, datafile_name, _prefix, ofname) This function is called from gnomedev.c for the generate_hash call.
local get_hash_value (key, keylen) Calculate the first hash value - again, it already happened in cmph_search, but it doesn't return it anywhere.
special_strlen (s) Calculate the string length, but \[0-7]{1,3} is considered as just one character; this is how the C compiler sees it later.


Functions

local _compute_sizes ()
Determine how to allocate the bits of the 32 bits in the buckets. Lowest is the length of the data; this is zero when combining is off. Next is the offset, and lastly in the high bits the high part of the hash value.
In file: src/hash/hash-generate-cmph.c line 425
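
The bit allocation described above can be illustrated with a small sketch. This is not the code from hash-generate-cmph.c: the struct and function names (bucket_layout, bucket_pack and so on) are hypothetical, and a 32-bit hash value is assumed. It only shows one way to place the length in the lowest bits, the offset above it, and the upper bits of the hash value at the top of a 32-bit bucket.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout descriptor; the field names follow the documentation
 * above, not the actual source. The three widths must add up to 32. */
struct bucket_layout {
    unsigned hash_bits;     /* upper bits of the hash value kept in the bucket */
    unsigned offset_bits;   /* offset of the data in the data string */
    unsigned length_bits;   /* length of the data; 0 when combining is off */
};

/* Mask with the lowest `bits` bits set; valid for 0 <= bits <= 32. */
static uint32_t low_mask(unsigned bits)
{
    return (uint32_t)(((uint64_t)1 << bits) - 1);
}

/* Pack one bucket: length lowest, then offset, then the high part of the
 * (assumed 32-bit) hash value. 64-bit intermediates avoid shifts by 32. */
static uint32_t bucket_pack(const struct bucket_layout *bl,
                            uint32_t hash, uint32_t offset, uint32_t length)
{
    assert(bl->hash_bits + bl->offset_bits + bl->length_bits == 32);
    assert(offset <= low_mask(bl->offset_bits));
    assert(length <= low_mask(bl->length_bits));

    uint64_t hash_part = (uint64_t)hash >> (32 - bl->hash_bits);
    uint64_t b = length;
    b |= (uint64_t)offset << bl->length_bits;
    b |= hash_part << (bl->length_bits + bl->offset_bits);
    return (uint32_t)b;
}

static uint32_t bucket_length(const struct bucket_layout *bl, uint32_t b)
{
    return b & low_mask(bl->length_bits);
}

static uint32_t bucket_offset(const struct bucket_layout *bl, uint32_t b)
{
    return (uint32_t)((uint64_t)b >> bl->length_bits) & low_mask(bl->offset_bits);
}

static uint32_t bucket_hash_part(const struct bucket_layout *bl, uint32_t b)
{
    return (uint32_t)((uint64_t)b >> (bl->length_bits + bl->offset_bits));
}
```

With a hypothetical split of 12 hash bits, 14 offset bits and 6 length bits, packing hash 0xABCDEF01, offset 1000 and length 17 yields 0xABC0FA11. With combining off, length_bits is 0 and the length argument must be 0.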
local _output_buckets (L, ofile)
Output the buckets table. Each bucket is 32 bits long and contains part of the hash value, the offset of the associated data and the data length.

Parameters

  • L:
  • ofile:
In file: src/hash/hash-generate-cmph.c line 467
build_hash_table (L, ifile, ofile)
Given the already generated hash function, read the list of keys and the associated value, and write the hash table. Each bucket contains exactly one entry:

  bytes  contents
  4      hash value of the name
  2      offset of the data in the data string

The data string contains the actual data.

Parameters

  • L:
  • ifile:
  • ofile:
In file: src/hash/hash-generate-cmph.c line 566
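
When combining is off, the module description states that data is written sequentially in bucket order, so no length is stored and each entry ends where the next bucket's data starts. The sketch below illustrates that idea only; the function names are hypothetical, and it assumes that the last bucket's data ends at the total size of the data string, which the documentation does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assign ascending offsets to entries that are already sorted by bucket
 * number. lengths[i] is the data length for bucket i. Returns the total
 * size of the data string. (Hypothetical helper, not from the source.) */
static uint32_t layout_sequential(const uint32_t *lengths, size_t n_buckets,
                                  uint32_t *offsets)
{
    uint32_t pos = 0;
    for (size_t i = 0; i < n_buckets; i++) {
        offsets[i] = pos;
        pos += lengths[i];
    }
    return pos;
}

/* Recover the data length of bucket i from adjacent offsets. The last
 * bucket is assumed to end at data_size. */
static uint32_t data_length(const uint32_t *offsets, size_t n_buckets,
                            uint32_t data_size, size_t i)
{
    assert(i < n_buckets);
    uint32_t end = (i + 1 < n_buckets) ? offsets[i + 1] : data_size;
    return end - offsets[i];
}
```

For example, data lengths 3, 0 and 5 in bucket order give offsets 0, 3 and 3 and a total size of 8; the lookup then recovers the lengths 3, 0 and 5 again without storing them.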
local convert_funcnr (nr)
Given a CMPH hash function number, convert that to the numbers used within LuaGnome.

Parameters

  • nr:
In file: src/hash/hash-generate-cmph.c line 88
local dump_bdz (f)
Output the additional data fields specific for the BDZ algorithm.

Parameters

  • f:
In file: src/hash/hash-generate-cmph.c line 164
local dump_fch (ofile)
Output the data structure for fch. Required fields: h1, h2, m, b, p1, p2, g

Parameters

  • ofile:
In file: src/hash/hash-generate-cmph.c line 105
local dump_mphf (L, f)
Call the appropriate dump function to create C code containing the data of the cmph hash function. The code is written to stdout.

Parameters

  • L:
  • f:

Return value:

0 on success, 1 on error.
In file: src/hash/hash-generate-cmph.c line 291
generate_hash_cmph (L, datafile_name, _prefix, ofname)
This function is called from gnomedev.c for the generate_hash call.

Parameters

  • L:
  • datafile_name:
  • _prefix:
  • ofname:
In file: src/hash/hash-generate-cmph.c line 680
local get_hash_value (key, keylen)
Calculate the first hash value - again, it already happened in cmph_search, but it doesn't return it anywhere. This is an unfortunate intrusion into cmph internals!

Parameters

  • key:
  • keylen:
In file: src/hash/hash-generate-cmph.c line 224
special_strlen (s)
Calculate the string length, but \[0-7]{1,3} is considered as just one character; this is how the C compiler sees it later. Note: \n, \t etc. are also detected, although invalid escape sequences are not checked for.

Parameters

  • s:
In file: src/hash/hash-generate-cmph.c line 251
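
The counting rule can be illustrated with a minimal sketch. This is not the implementation at line 251; the name special_strlen_sketch is hypothetical, and the handling of a trailing lone backslash (counted as one character) is an assumption. A backslash followed by one to three octal digits counts as one character, and a backslash followed by any other character also counts as one, without validation.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters of s as a C compiler would see them inside a string
 * literal: an escape sequence counts as a single character. */
static size_t special_strlen_sketch(const char *s)
{
    size_t len = 0;

    assert(s != NULL);
    while (*s != '\0') {
        if (*s == '\\' && s[1] != '\0') {
            s++;                                /* skip the backslash */
            if (*s >= '0' && *s <= '7') {
                /* octal escape: consume at most three octal digits */
                int digits = 0;
                while (digits < 3 && *s >= '0' && *s <= '7') {
                    s++;
                    digits++;
                }
            } else {
                s++;                            /* \n, \t, \\ etc.; not validated */
            }
        } else {
            s++;                                /* ordinary character */
        }
        len++;
    }
    return len;
}
```

For example, the seven-byte text a\000b\n (with literal backslashes) counts as four characters, and \1234 counts as two, because the octal escape stops after three digits.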
