Thread: Way OT : Surname Database for ethnic coding
Thu, May 25 2006 10:19 PM

Dondi
Gentlemen :

I have been doing the type of coding described on this web page
http://w1.melissadata.com/listservices/ethniccoder.htm
with a program they no longer offer. And no one at the company I spoke
to even knows of its prior existence. They will only provide the
service. Go figure.

I have searched the web and was unable to find either a database or a
program that I can use to do ethnic coding.

I spoke to the people at Ethnic Technologies and suffice it to say that
1/4 million dollars is out of my league.


Any ideas, leads or suggestions greatly appreciated.

Later . . .
Dom
Tue, May 30 2006 8:14 AM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dom,

<< I have searched the web and was unable to find either a database or a
program that I can use to do ethnic coding. >>

It's mainly an algorithm, is it not ?  There can't be that many different
ethnicities that would require a very large database.

<< I spoke to the people at Ethnic Technologies and suffice it to say that
1/4 million dollars is out of my league. >>

Wow !!!  I mean, wow !!!!  I'm in the completely wrong business.  They must
have 3 *really good* customers. Smiley

Sorry I couldn't be of any actual help. Smiley

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, May 30 2006 9:16 AM

Roy Lambert

NLH Associates

Team Elevate

Dondi


If you've been doing it, doesn't that mean you already have a starter for your own database? As Tim says, it can't be that difficult. They don't ask for much information, and suggest that they base their guess on surname.

Why not start with what you've got? New surnames will require manual intervention, but you should be able to build up a fairly good database reasonably quickly.

Roy Lambert
Tue, May 30 2006 5:13 PM

Dom Masters
Tim,

See response below.

Dom
Tue, May 30 2006 8:46 PM

Dondi
Roy and Tim,

<< If you've been doing it doesn't that mean you already
have a starter for your own database? As Tim says it can't be that
difficult. >>

I agree, it would appear that way. Sometimes the easy things turn out to
be the most difficult to resolve.

<< They don't ask for much information, and suggest that they base
their guess on surname. >>

Allow me to briefly elaborate on what I am doing. I have a database of
over 3 million registered voters residing in New York City, an ethically
and racially diverse place.

While it is possible to identify ethnicity and race by surnames,
sometimes it is not that straightforward. The Census Bureau has a
Hispanic / Latino surname list and that simplifies things for that group.

Surnames like Lambert and Young are not so simple. Is Lambert French or
French Canadian or just a plain old WASP? Smiley Are Young and Lambert
white or black? What we do in these cases is go to the Census Data and
see what the population figures for that zip code are. Thus, the request
for the address. If the population for that zip code is 80% black then
the probability is that these individuals are black. If there is
still some ambiguity at the zip code level then I continue to drill down
to the census tract level or even further if necessary. Each time the
new census figures are released, this has to be done again.
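The drill-down described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the program referred to in this thread: the table names, the ZIP and tract keys, the share figures and the 80% cut-off are all hypothetical placeholders, and a real run would load the tables from the Census surname list and the census population figures.

```python
from typing import Dict, Optional

# Hypothetical lookup tables. Every key and number here is made up for
# illustration; real tables would be loaded from census data.
SURNAME_GROUP: Dict[str, str] = {"GARCIA": "hispanic"}
ZIP_SHARES: Dict[str, Dict[str, float]] = {
    "00001": {"black": 0.80, "white": 0.20},  # clear majority at ZIP level
    "00002": {"black": 0.55, "white": 0.45},  # ambiguous at ZIP level
}
TRACT_SHARES: Dict[str, Dict[str, float]] = {
    "T2": {"white": 0.90, "black": 0.10},
}


def dominant_group(shares: Optional[Dict[str, float]],
                   threshold: float) -> Optional[str]:
    """Return the largest group if its share reaches the threshold."""
    if not shares:
        return None
    group, share = max(shares.items(), key=lambda item: item[1])
    return group if share >= threshold else None


def code_record(surname: str, zip_code: str, tract: str,
                threshold: float = 0.80) -> Optional[str]:
    """Best guess for one record, or None if it needs manual review."""
    # Step 1: a surname list settles the unambiguous names.
    key = surname.strip().upper()
    if key in SURNAME_GROUP:
        return SURNAME_GROUP[key]
    # Step 2: fall back to ZIP-level figures, then to the census tract.
    for shares in (ZIP_SHARES.get(zip_code), TRACT_SHARES.get(tract)):
        group = dominant_group(shares, threshold)
        if group is not None:
            return group
    return None
```

Records that come back as `None` are the ones that would still need a finer geographic level or manual review, and the geographic tables have to be reloaded whenever new census figures are released.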

They provide the coding as a service for $15 per thousand records ( you
do the math ) still an expensive proposition. Furthermore I am not
giving up my database to anyone.

<< Wow !!!  I mean, wow !!!!  I'm in the completely wrong business. >>

Tell me about it !!!

<< They must have 3 *really good* customers. Smiley >>

Actually, when you think about it there is a big demand for this kind of
service in corporate America. Supermarket chains, movie studios, radio
stations etc... want to know what ethnic or racial groups purchase their
products. And, like me, do not want to give up their database to third
party vendors.

Thanks for your responses.

Dom
Wed, May 31 2006 12:22 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dondi,

<< Surnames like Lambert and Young are not so simple. Is Lambert French or
French Canadian or just a plain old WASP? Smiley Are Young and Lambert white or
black? What we do in these cases is go to the Census Data and see what the
population figures for that zip code are. Thus, the request for the address.
If the population for that zip code is 80% black then the probability is
that these individuals are black. If there is still some ambiguity at
the zip code level then I continue to drill down to the census tract level
or even further if necessary. Each time the new census figures are released,
this has to be done again. >>

Well, Young in my case is adopted, so I could be Martian for all I know. Smiley
But yeah, I get what you're saying here and it is much more complicated than
it seems at first blush.  It's actually a cross-reference of several large
databases in order to derive a "best guess".

<< They provide the coding as a service for $15 per thousand records ( you
do the math ) still an expensive proposition. Furthermore I am
not giving up my database to anyone. >>

Ahh, I misunderstood you to mean that they were charging a flat fee.   $15
per thousand isn't so bad unless you've got a ton of records, which in this
case you do. Frown

<< Actually, when you think about it there is a big demand for this kind of
service in corporate America. Supermarket chains, movie studios, radio
stations etc... want to know what ethnic or racial groups purchase their
products. And, like me, do not want to give up their database to third party
vendors. >>

Yes, I'm sure there are plenty of businesses that want that information.

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, May 31 2006 2:05 PM

>  Are Young and Lambert white or black?
[snip]
> If the population for that zip code is 80%
> black then the probability is that these individuals are black.

That's something that I'd have to question, while admitting that I may be
wrong. I'd want to do some sort of verification somehow. Because if you
assume that a name that has WASP origins is likely to have changed in a
predominantly different ethnic area, then what is the 20% going to be made
of? For example, let's say that 80% of the people in an area have the name
"black", and that is usually an indicator of them being black. I now have
a name which is "white". Should I assume that because they are in this
area they are probably black, or go with the probability that they are
part of the 20% of different origin and white?

I can see why the cost of doing this would be high if you want any sort of
accuracy.
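One way to frame the question above is as a Bayes-style combination of the two pieces of evidence rather than a choice between them. The sketch below is an assumption on my part, not anything the vendors in this thread are known to use: it treats surname and area as independent given the group, and every probability in the example is invented for illustration.

```python
from typing import Dict


def combine(surname_probs: Dict[str, float],
            area_probs: Dict[str, float],
            prior: Dict[str, float]) -> Dict[str, float]:
    """Combine P(group | surname) and P(group | area) into one posterior.

    Assumes surname and area are conditionally independent given the
    group, so the posterior is proportional to
    P(group | surname) * P(group | area) / P(group).
    """
    unnormalised = {}
    for group, p_prior in prior.items():
        if p_prior <= 0:
            continue  # a group with zero prior cannot be supported
        unnormalised[group] = (surname_probs.get(group, 0.0)
                               * area_probs.get(group, 0.0) / p_prior)
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("the evidence rules out every group")
    return {group: value / total for group, value in unnormalised.items()}


# Invented numbers: a surname that is 90% "white" overall, found in an
# area that is 80% "black", with an even overall split as the prior.
posterior = combine(
    surname_probs={"white": 0.9, "black": 0.1},
    area_probs={"white": 0.2, "black": 0.8},
    prior={"white": 0.5, "black": 0.5},
)
print(posterior)  # "white" comes out at about 0.69
```

With these made-up inputs the strong surname signal outweighs the area figures, which is the "part of the 20%" reading. A weaker surname signal would let the area dominate instead, so the answer depends entirely on how reliable each table is.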

/Matthew Jones/
Sat, Jun 3 2006 12:58 AM

Dondi
Matthew,

<< That's something that I'd have to question, while admitting that I may be
wrong. I'd want to do some sort of verification somehow. Because if you
assume that a name that has WASP origins is likely to have changed in a
predominantly different ethnic area, then what is the 20% going to be made
of? >>

There really isn't any right or wrong here, unless you are opposed to
ethnic and racial profiling and you would not be alone in that position.
Politics is still an arena where it is perfectly acceptable to do
profiling and lots of people make lots of money along the way.

In the 50's and 60's there was a popular country singer as "white" as
can be named Brenda Lee. Regardless of where she resides she probably
ends up categorized as an Asian.

Not everyone who resides in "China Town" is Chinese, not everyone who
resides in "Harlem" is black and not everyone who resides in "Spanish
Harlem" is Latino.

<< I can see why the cost of doing this would be high if you want any
sort of accuracy. >>

While accuracy is desirable, in my opinion, they charge so much because
people are reluctant to give up the lists to "strangers". The biggest
user of this "technology" is the medical profession as more and more
maladies are being identified with particular ethnic and racial groups.
They need to do this identification "in house".

DBISAM is very well suited for this type of endeavor. Want to give it a
try? Smiley

Thanks for your very insightful thoughts.

Dom
Sat, Jun 10 2006 9:21 AM

James Margarit
Dondi wrote:

> ...Allow me to briefly elaborate on what I am doing. I have a database of
> over 3 million registered voters residing in New York City, an *ethically*
> and racially diverse place...
>

Amen to that brother and good luck divining ethics from surnames Wink

Let me know what the database says about the surnames of New York
residents that might be running for president in a few years...

Jim