September, 2008

RE: from Hanna

I’d like you to meet Hanna, a charming blue-eyed blonde, brunette with brown eyes.

Did you get the “mistake” in the previous sentence? Without reading it twice?

Today, the following mail made it past our spamassassin installation:

I am a charming blue-eyed blonde, brunette with brown eyes, and I'm
looking for an intelligent man to communicate by e-mail, Skype, or on
real dates!

Write me a message by email: Hanna@superflh.com

I find it very interesting to see spam mails like this appear. Nothing is being
sold here, no links are being made to commercial websites, no attachments came
with it, … and it’s clearly an autogenerated text, meant to make sense in a
way. So what is the purpose?

I guess they are trying to hit the so-called self-learning (mostly Bayesian)
spam filters at their weakest point: they are only computers. In many
situations, mails like this will be manually marked as spam, and so they make
it into the learning system as such. Enough of these mails eventually lead
spam filters to mark legitimate mails as spam, which for many people is totally
unacceptable. And those people might just turn off their spam filters because
of that. They “don’t work” anyway, right?

Similar things happen elsewhere too, apparently, trying to fool Mollom.

Does this reasoning make sense?
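One way to sanity-check the reasoning is a toy word-counting filter. The sketch
below is only an illustration under my own assumptions: the class
TinyBayesFilter, the training mails and the test mail are all made up, and this
is not how spamassassin is implemented. It shows that once a few harmless-looking
mails have been marked as spam, an ordinary mail sharing their words gets a much
higher spam score.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes filter over whitespace-separated words (hypothetical)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.mails = {"spam": 0, "ham": 0}

    def train(self, label, text):
        # Remember one more mail of this class and count its words.
        self.mails[label] += 1
        self.words[label].update(text.lower().split())

    def spam_probability(self, text):
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total_mails = sum(self.mails.values())
        log_score = {}
        for label in ("spam", "ham"):
            word_total = sum(self.words[label].values())
            score = math.log(self.mails[label] / total_mails)
            for word in text.lower().split():
                # Laplace smoothing: unseen words must not zero the product.
                count = self.words[label][word]
                score += math.log((count + 1) / (word_total + vocab))
            log_score[label] = score
        # Turn the two log scores back into P(spam | text).
        return 1 / (1 + math.exp(log_score["ham"] - log_score["spam"]))


f = TinyBayesFilter()
f.train("spam", "cheap pills buy now")
f.train("ham", "meeting notes attached for tomorrow")

legit = "looking for an intelligent man to communicate by email"
before = f.spam_probability(legit)

# The user marks five Hanna-style mails as spam.
for _ in range(5):
    f.train("spam", "charming blonde looking for an intelligent man "
                    "to communicate by email or skype")

after = f.spam_probability(legit)
print("before: %.2f  after: %.2f" % (before, after))
```

With this toy data the score of the legitimate mail starts near the middle and
ends close to 1, which is exactly the false positive that makes people switch
their filter off.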

Anyway, Hanna does seem interesting to me. Let’s drop her a line…
;-)

Posted in Life, the Universe, and Everything   No Comments »

Google pagination rounding

Hmm, I just found out something strange.

  • Click this link: http://www.google.com/search?hl=en&safe=off&q=%22mieke+van+loon%22&start=20&sa=N
  • How many pages with search results does Google return? (6)
  • How many search results? (1,330)
  • Now click on Next at the bottom
  • How many pages with search results does Google return? (5)
  • How many search results? (1,330)
  • Now click on Next at the bottom again
  • How many pages with search results does Google return? (5)
  • How many search results? (46)

Can somebody explain?

Doesn’t this make all those “proofs” based on Google fights moot? I can
understand that Google does not count exactly how many results a search query
returns, for optimisation’s sake, but going from 1,330 to 46… that’s a bit
far off, isn’t it?
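The arithmetic behind the numbers above can be checked quickly. This is a small
sketch that assumes 10 results per page (the start=20 in the link suggests
that, but it is my assumption); pages_needed is a made-up helper name.

```python
import math

RESULTS_PER_PAGE = 10  # assumption: Google's default page size


def pages_needed(result_count, per_page=RESULTS_PER_PAGE):
    """Number of result pages needed to show result_count hits."""
    return math.ceil(result_count / per_page)


print(pages_needed(46))    # 5
print(pages_needed(1330))  # 133
```

So the 5 pages Google offers match the final count of 46, not the 1,330 it
announced at first.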

(For the record, Mieke Van Loon is a member of my family and I was
looking up her personal web page.)

Posted in Life, the Universe, and Everything   No Comments »

pyfconfig — how to get ifconfig data without regular expressions

After reading this post about ifconfig output parsing by Kris, I remembered I
once needed a cross-platform way to get the IP address of an interface in
Python.

Of course I could just parse the output of `ifconfig`, but I really don’t like
such ugly hacks. I guess I’ve had too many (bad) experiences with libwww-perl
scripts I wrote to harvest all sorts of things from the web. Basically, this is
output parsing too, as HTML is generally the result of some server-side script.
Each time a webpage changed the way it looked (non-CSS changes) or worked, my
scripts started to fail.

That’s when I decided I would always try to avoid such clumsy dependencies on
third-party software.
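To make the fragility concrete, here is a small sketch of the kind of parser I
want to avoid. The two sample outputs are illustrative, hand-written
approximations of an older and a newer Linux ifconfig layout, and ADDR_RE and
parse_ip are made-up names. The regular expression fits the first layout and
silently finds nothing in the second.

```python
import re

# Illustrative samples for the same loopback interface (not captured output).
OLD_STYLE = """lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
"""
NEW_STYLE = """lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
"""

# Written against the old layout only.
ADDR_RE = re.compile(r"inet addr:(\d+\.\d+\.\d+\.\d+)")


def parse_ip(ifconfig_output):
    """Return the first IPv4 address found, or None if the layout changed."""
    match = ADDR_RE.search(ifconfig_output)
    return match.group(1) if match else None


print(parse_ip(OLD_STYLE))  # 127.0.0.1
print(parse_ip(NEW_STYLE))  # None
```

Nothing crashes here; the script just quietly stops returning an address, which
is the worst kind of failure.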

So, back to the Python question. I set out for a short adventure on
comp.lang.python and came up with a solution after some fiddling: pyfconfig, a
cross-platform Python module to query the IP address of an interface. Tested on
FreeBSD x86, GNU/Linux x86 and GNU/Linux x86_64, with Python 2.4 and Python
2.5. Works just fine.


#include "Python.h"

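// System headers for getifaddrs(), getnameinfo(), the sockaddr structs,
// string comparison and perror() (reconstructed list)
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <netdb.h>
#include <ifaddrs.h>
#include <stdio.h>
#include <string.h>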

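// parameters: string (interface name)
// output: string (IP address of the interface in numeric notation)
static PyObject * ipaddr(PyObject *self, PyObject *args) {

    char ip[200] = "";   // stays empty until an address has been found
    char *itf;

    if (!PyArg_ParseTuple(args, "s", &itf)) {
        PyErr_SetString(PyExc_Exception, "no interface given!");
        return NULL;
    }

    struct ifaddrs *ifa = NULL, *ifp = NULL;

    if (getifaddrs(&ifp) < 0) {
        // raise OSError from errno; returning NULL without an exception
        // set would give a SystemError in Python
        return PyErr_SetFromErrno(PyExc_OSError);
    }

    for (ifa = ifp; ifa; ifa = ifa->ifa_next) {
        socklen_t salen;

        // ifa_addr can be NULL for an interface without an address
        if (ifa->ifa_addr == NULL)
            continue;

        if (ifa->ifa_addr->sa_family == AF_INET)
            salen = sizeof(struct sockaddr_in);
        else if (ifa->ifa_addr->sa_family == AF_INET6)
            salen = sizeof(struct sockaddr_in6);
        else
            continue;

        // compare the whole name; sizeof(itf) is only the size of a pointer
        if (strcmp(ifa->ifa_name, itf) != 0)
            continue;

        // getnameinfo() returns a non-zero EAI_* code on failure
        if (getnameinfo(ifa->ifa_addr, salen, ip, sizeof(ip), NULL, 0, NI_NUMERICHOST) != 0) {
            ip[0] = '\0';
            continue;
        }
        break;
    }

    freeifaddrs(ifp);

    // no matching interface (or no usable address): raise instead of
    // returning whatever happened to be in the buffer
    if (ip[0] == '\0') {
        PyErr_SetString(PyExc_Exception, "no address found for interface");
        return NULL;
    }

    return Py_BuildValue("s", ip);
}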

static PyMethodDef pyfconfig_methods[] = {
    {"ipaddr", (PyCFunction)ipaddr, METH_VARARGS, "ipaddr(string)\n"},
    {NULL, NULL, 0, NULL}
};

DL_EXPORT(void) initpyfconfig(void) {
    Py_InitModule3("pyfconfig", pyfconfig_methods,
                   "Provides a function to get an ip address of a certain interface.\n");
}

Compile with gcc -fPIC -shared -Wl,-soname,pyfconfig.so -o pyfconfig.so -I/usr/include/python2.5/ pyfconfig.c (for Python 2.5).
After an import pyfconfig, pyfconfig.ipaddr('lo') should return '127.0.0.1'
(YMMV; on FreeBSD the loopback interface is called lo0).

No dependency on the output formatting of ifconfig. Fewer bugs.

Posted in Open Source Adventures, scripting   No Comments »

SEO

I have been blogging (infrequently) for only about seven months now and I must
say I have already got more reactions than I thought I would ever have. Still,
visitor statistics and Google showups remained fairly low. After my last post
in particular, I was quite surprised: when Googling for “poor man’s NTP”, the
first result returned was Kris Buytaert’s reaction. Moreover, my original
article was nowhere to be found in the search results.

This got me thinking…

For years now, my webpage has been found by the keywords “Gentoo”, “Macbook”
and “ING”, mainly because of some articles I wrote. But none of my blog posts
seemed to get indexed.

I’m not going into detail here, but after 5 minutes of thinking, 5 minutes of
tinkering with my Apache setup (I removed a 301 detour towards my blog) and a
little sed action (nanoblogger is great!), Google seems to love me much more.

Try it! :-)

Posted in Open Source Adventures   1 Comment »