Márton Szabó | 15 Feb 2012 15:39

IMDB database creation

Hello Guys, 

First of all, thanks for IMDbPY; it works great.
I have a question, though. It may not be closely enough related to IMDbPY for this mailing list, but I just can't get over it.

I wanted to play with the IMDB UNIX search programs, which can be found here: http://www.imdb.com/interfaces
I was able to compile the programs, but I couldn't create the databases.
Many .list files failed to convert to .data/.names/.titles files, with errors like these:

/usr/local/moviedb-3.24/etc/mkdb -movie
Adding Movies List...
make[3]: *** [movies.data] Bus error

/usr/local/moviedb-3.24/etc/mkdb -acr
Adding Actors...
mkdb: too many titles -- increase MAXTITLES
make[3]: *** [actors.data] Error 255

/usr/local/moviedb-3.24/etc/mkdb -acs
Adding Actresses...
make[3]: *** [actresses.data] Segmentation fault
make[3]: *** Deleting file `actresses.data'

I thought this could be traced back to the stack size limit on Mac OS X, which is limited by default and can be raised to only 64MB.
So I tried on Ubuntu with the stack size set to unlimited, but I ran into the same problem. I asked on other forums for advice, without success.

Do you guys have any idea how to create these databases?

Thanks!

Regards, 
Márton
------------------------------------------------------------------------------
Virtualization & Cloud Management Using Capacity Planning
Cloud computing makes use of virtualization - but cloud computing 
also focuses on allowing computing to be delivered as a service.
http://www.accelacomm.com/jaw/sfnl/114/51521223/
_______________________________________________
Imdbpy-help mailing list
Imdbpy-help@...
https://lists.sourceforge.net/lists/listinfo/imdbpy-help
Davide Alberani | 15 Feb 2012 19:58

Re: IMDB database creation

On Wed, Feb 15, 2012 at 15:39, Márton Szabó <habakukk1@...> wrote:
>
> First of all thanks for IMDbPY it works great.

Thanks. :-)

> I wanted to play with the IMDB UNIX search programs, that can be found here:
> http://www.imdb.com/interfaces

Wow!  They are a little... demodé. :-)
I don't really expect them to work on a recent set of data:
for some time now, all the titles in the plain text data files
have been listed in the "The Title" format, while previously they
were "Title, The".

We supported the output of moviedb up to IMDbPY 4.1;
after that the changes were so many that it was no longer
worth the effort (and the 'sql' method works much better).

> I was able to compile the programs, but I couldn't create the databases.
> Lots of .list files failed to convert to .data/.names/.titles files with
> error codes like these:
>
> /usr/local/moviedb-3.24/etc/mkdb  -movie
> Adding Movies List...
> make[3]: *** [movies.data] Bus error

Here's the most important excerpt from my old README.local:
====================================================
NOTE: the current (3.24) moviedb version is old and it was not
designed with TV series episode support in mind.
It can still work very well, but you have to modify some constants
in the code: edit the "moviedb.h" file in the "src" directory,
and change MAXTITLES to _at least_ 1600000, MAXNAKAENTRIES
to 700000, MAXFILMOGRAPHIES to 20470, LINKSTART to 1000000
and MAXBIOENTRIES to 500000.
Also, setting MXLINELEN to 1023 is a good idea.
See http://us.imdb.com/database_statistics for more up-to-date
statistics.
====================================================
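As a rough sketch of what that excerpt asks for, the edited constants in src/moviedb.h could end up looking like the block below. The values are the ones quoted from README.local; the exact layout of the real header is not shown in this thread, and the titles_fit() helper is hypothetical, added only to illustrate the kind of limit check that produces "too many titles -- increase MAXTITLES".

```c
#include <assert.h>

/* Limits raised as suggested in README.local (values quoted above). */
#define MAXTITLES        1600000  /* "at least" this many */
#define MAXNAKAENTRIES    700000
#define MAXFILMOGRAPHIES   20470
#define LINKSTART        1000000
#define MAXBIOENTRIES     500000
#define MXLINELEN           1023

/* Hypothetical helper, NOT part of moviedb: returns 1 when a title
   count fits within MAXTITLES, 0 otherwise. The real check in mkdb
   may use a different comparison. */
static int titles_fit(long ntitles)
{
    assert(ntitles >= 0);
    return ntitles <= MAXTITLES;
}
```

After changing the header, the programs have to be recompiled before running mkdb again, since these limits are compile-time constants.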

You can read a complete copy here:
http://erlug.linux.it/~da/erlugtmp/README.local
The current version:
https://bitbucket.org/alberanid/imdbpy/src/74e6f583f9cf/docs/README.local

If you need the other tools we developed to use these data, you
can download IMDbPY 4.1 from
http://sourceforge.net/projects/imdbpy/files/IMDbPY/4.1/
but I guess that at this point you've changed your mind. :-P

HTH,
-- 
Davide Alberani <davide.alberani@...>  [PGP KeyID: 0x465BFD47]
http://www.mimante.net/

Márton Szabó | 16 Feb 2012 15:55

Re: IMDB database creation

Wow, you are awesome! I was able to create the databases, although not every program works with every
database (I can't list directors, but I can list actors as directors. Weird...). I guess it's because of the
outdated programs.
Maybe I will dig into your SQL version. 
Thanks for your help!

Regards, 
marchello


Davide Alberani | 16 Feb 2012 20:58

Re: IMDB database creation

On Thu, Feb 16, 2012 at 15:55, Márton Szabó <habakukk1@...> wrote:
>
> Wow, you are awesome! I was able to create the databases, although not every program works with every
> database (I can't list directors, but I can list actors as directors. weird...) I guess it's because the
> outdated programs.

Yep.
The C code is clean enough to be understood and
modified easily, but I fear it's not worth the effort. :-/

> Maybe I will dig into your SQL version.

Makes sense. :-)

-- 
Davide Alberani <davide.alberani@...>  [PGP KeyID: 0x465BFD47]
http://www.mimante.net/

