Review: Add hash extract feature to libwapcaplet

Daniel Silverstone dsilvers at
Sat Jun 20 15:01:27 BST 2009

On Sat, 2009-06-20 at 14:48 +0100, Daniel Silverstone wrote:
> Changed files
>  include/libwapcaplet/libwapcaplet.h |   15 +++++++++++++++
>  src/libwapcaplet.c                  |    7 +++++++
>  2 files changed, 22 insertions(+)

I note there's no test for this change. It adds a feature which means it
needs tests. Specifically one which ensures that a NULL lwc_string
argument causes an assert, and one which ensures that a valid lwc_string
does not.

I can't decide if a test to demonstrate that the hash value is different
for two different strings is worthwhile since, of course, it's quite
possible to produce two strings which hash to the same value.

> Index: include/libwapcaplet/libwapcaplet.h
> +/**
> + * Retrieve the hask value of the string.
> + *
> + * @note The API should be used as a convenient way of get the underlaid
> + * 	 string's hash value. And the hash value should never be used to
> + *	 identified the string, some code as:
> + *
> + *	 if (hashvalue == 0xb1ab1a00) {
> + *  		assume string == "div"
> + *  	 }
> + *  	 
> + *       should never appear. The hash value is suitable for a hash table
> + *	 key.
> + */
> +extern uint32_t lwc_string_hash_value(lwc_string *str);

This documentation comment is a good start, but is perhaps a little
over-zealous. I'm happy that you have taken on board my issue with the
presence of the function, but perhaps the following might be better:

 * Retrieve (or compute if unavailable) a hash value for the content of the string.
 * @note This API should only be used as a convenient way to retrieve a hash
 *       value for the string. This hash value should not be relied on to be
 *       unique within an invocation of the program, nor should it be relied upon
 *       to be stable between invocations of the program. Never use the hash
 *       value as a way to directly identify the value of the string.

There is no need to give an example of *BAD* code, because people will
remember the coding idiom and forget that it's a bad idiom.

Eventually libwapcaplet's doc comments need extending to better document
their arguments and returns, but for now you have fitted into the style
in the header well.
> Index: src/libwapcaplet.c

> +uint32_t lwc_string_hash_value(lwc_string *str)

The style in libwapcaplet.c calls for the name of the function to be on
a newline at the start:

lwc_string_hash_value(lwc_string *str)

So that we can easily search for the function by regex-searching for

So, overall, it's fine as an idea. Please update the documentation
comment as per above, fix the code style issue in libwapcaplet.c and add
some tests. If you need help with adding the tests then let me know.


Daniel Silverstone             
PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895

More information about the netsurf-dev mailing list