P.Z. Low Cost CPanel Web Hosting  

Accessing raw log files
ldesign
Accessing raw log files - 10-08-2005, 01:40 AM

I have written a PHP script to run on my home computer that reads a raw Apache access log file (in standard format) and loads it into a MySQL database. I wrote more scripts that let me search the log file database and present the data in a way that is meaningful to me.

The one catch is that I have to manually download the raw log file using cPanel. I would like to eliminate this manual step and automatically download my gzipped log file every day with a scheduled task (or have a cron job "push it" to my machine from the server).

Is there any way to do that?

Is there any way to access my site's raw log files from a script running on the server?

Chuck
   
midwest
10-08-2005, 07:17 PM

What do you have at home for a platform? win or linux? static or dynamic IP?

edit:

read this
http://us3.php.net/manual/en/ref.ftp.php
then this
http://us3.php.net/manual/en/ref.zlib.php

Depending on your setup, zlib might be able to handle it all for you.
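
As a rough illustration of what the zlib extension offers, here is a minimal sketch that reads a gzipped file one uncompressed line at a time with gzopen and gzgets. The helper name read_gz_lines and the file name in the usage note are made up for the example.

```php
<?php
// Sketch: read a gzipped log file line by line with the zlib extension.
// read_gz_lines() is a hypothetical helper name, not part of PHP.
function read_gz_lines(string $path): array
{
    $zp = gzopen($path, 'rb');
    if ($zp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $lines = [];
    // gzgets() returns one uncompressed line, or false at end of file.
    while (($line = gzgets($zp)) !== false) {
        $lines[] = rtrim($line, "\r\n");
    }
    gzclose($zp);
    return $lines;
}
```

Usage would look something like `$lines = read_gz_lines('access_log.gz');`, assuming the file has already been downloaded.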


Ronnie Gauthier
www.instaguide.com

======================
for official page-zone support please visit
www.page-zone.com/support.shtml
   
ldesign
10-09-2005, 01:49 AM

Quote:
Originally Posted by midwest
What do you have at home for a platform? win or linux? static or dynamic IP?

I use Windows XP and I have a dynamic IP (Comcast). It is not very dynamic, though. It only changes every few months, so it is nearly static. I wouldn't mind updating it manually when I get a notice that my script has failed, a few times a year at most.

Quote:
Originally Posted by midwest
read this
http://us3.php.net/manual/en/ref.ftp.php

Interesting. I don't think I'll need to do any more than read and write files, so I don't need that level of functionality ... but oh no! Now you've done it ...
... you may have set me down the path to my next time-consuming personal project. I've never even thought of writing my own FTP client, but I can see now that it would be a fun and fairly simple (yet time-consuming) challenge. I would love to be able to use my own FTP client that did all the things I want, the way I want them done.

... I tend to go off (the deep end?) and write my own apps to do things, like the log file database application I am currently spending too much time on.

I'd never seen this extension before, nor even thought about FTP. I like it ... an FTP client that does everything I want it to and the way I want it to. Hmmmmmmmmmm ...

Quote:
Originally Posted by midwest
then this
http://us3.php.net/manual/en/ref.zlib.php

That looks like an interesting shortcut. I was going to use a Windows gzip binary (on my home computer) with an exec call, but this might be even easier (if I have this extension on my home system).

Thanks for the great pointers.

My problem now, though, is not how to process the files but how to get to them. I read an older forum post from Jim saying that the raw log files are not available via FTP. So what I'm really trying to find out is whether there is a way to access mine from my site on the server so I can make them available for download.

I'll eventually ask in a support ticket if I don't get any answers here.

Thanks,
Chuck
   
midwest
10-09-2005, 03:32 AM

I would use curl and start working with
Code:
www.your-domain.com:2082/getaccesslog/accesslog-your-domain.com-10-9-2005.gz
and once that works, just use the PHP date function to break out the current date and format it.
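
A minimal sketch of the date step, assuming the file name always follows the month-day-year pattern shown in the URL above. The helper name access_log_url and the http:// scheme are assumptions made for the example.

```php
<?php
// Sketch: build the raw-log URL for a given day.
// access_log_url() is a hypothetical helper; the http:// scheme is assumed.
function access_log_url(string $domain, int $timestamp): string
{
    // 'n-j-Y' is month-day-year without leading zeros, e.g. 10-9-2005.
    return sprintf(
        'http://www.%s:2082/getaccesslog/accesslog-%s-%s.gz',
        $domain,
        $domain,
        date('n-j-Y', $timestamp)
    );
}

// For today's log, pass time() as the timestamp.
echo access_log_url('your-domain.com', time()), "\n";
```

The resulting string can then be handed to curl or any other HTTP client.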


ldesign
10-09-2005, 04:10 AM

That works great. Thanks. Looks like I have my work cut out for me. I know how to use curl, so the rest is gravy.

There's more I want to ask about (like knowing how to use cPanel like that), but after a couple of pints (ok, a few) down at the local microbrewery listening to a bluegrass session, I think it can wait until tomorrow.

Many thanks,
Chuck
   
ldesign
10-09-2005, 06:18 PM

Okay, I've been messing with this a little today.

First, I don't know why the obvious went right by me, like how to get to the file (duh: look at the status bar in my browser, or right-click and copy the link while in cPanel). I guess I would have figured that out eventually. Not sure why I didn't right away.

But anyway, here are a couple more things.

1. You need to include username and password when accessing the gzip file, so the link is
username:password@www.your-domain.com:2082/getaccesslog/accesslog-your-domain.com-10-9-2005.gz

2. Is there any reason to use curl instead of simply using fopen, fread to download the gzip file (or use gzopen, gzgets to read line by line)?

Chuck
   
edwurster
10-09-2005, 07:07 PM

Quote:
Originally Posted by ldesign
1. You need to include username and password when accessing the gzip file, so the link is
username:password@www.your-domain.com:2082/getaccesslog/accesslog-your-domain.com-10-9-2005.gz

This works in a browser:
www.your-domain.com:2082/getaccesslog/
then enter username and password.

Which brings us to a related question: how would we do this with one URL?
http://username:password@www.your-do...m-10-9-2005.gz


Ed Wurster - sacts92 (old root10)

ldesign
10-09-2005, 08:53 PM

That single URL, with the correct username and password for your domain, works just as you have it.

(Note: I run Apache, PHP, and MySQL on my home PC.)

In a PHP script on your home computer you can fopen that URL ('rb' - read in binary mode) and then use successive freads to copy the entire compressed gzip file to your hard drive.
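
A minimal sketch of the fopen/fread approach described above. The helper name copy_in_chunks and the chunk size are assumptions, and opening a URL this way requires allow_url_fopen to be enabled.

```php
<?php
// Sketch: copy a (possibly remote) file to disk in binary mode, chunk by chunk.
// copy_in_chunks() is a hypothetical helper, not a built-in function.
function copy_in_chunks(string $source, string $destPath, int $chunkSize = 8192): int
{
    $in = fopen($source, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $source");
    }
    $out = fopen($destPath, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot write $destPath");
    }
    $total = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false) {
            break; // read error: stop and return what was copied so far
        }
        $total += fwrite($out, $chunk);
    }
    fclose($in);
    fclose($out);
    return $total; // number of bytes written
}
```

The source argument could be the username:password URL from earlier in the thread, and the destination any local path.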

Or you can use gzopen and read one line (uncompressed) at a time. I am writing it in this manner so I can go to the previous offset in the file and start from there. That way I only download the newer entries (lines).
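
A sketch of the offset idea, shown against a local gzip file. The helper name read_new_lines is made up, and the offset is assumed to be stored somewhere between runs. Note that offsets count uncompressed bytes, and that seeking forward in a gzip stream still has to decompress everything before the offset, so this saves processing rather than transfer.

```php
<?php
// Sketch: return only the lines added since the last run, plus the new offset.
// read_new_lines() is a hypothetical helper; offsets are uncompressed bytes.
function read_new_lines(string $gzPath, int $offset): array
{
    $zp = gzopen($gzPath, 'rb');
    if ($zp === false) {
        throw new RuntimeException("Cannot open $gzPath");
    }
    // gzseek() returns -1 on failure; on read streams it is emulated
    // by decompressing up to the requested position.
    if ($offset > 0 && gzseek($zp, $offset) === -1) {
        gzclose($zp);
        throw new RuntimeException("Cannot seek to offset $offset");
    }
    $lines = [];
    while (($line = gzgets($zp)) !== false) {
        $lines[] = rtrim($line, "\r\n");
    }
    $newOffset = gztell($zp); // remember this for the next run
    gzclose($zp);
    return [$lines, $newOffset];
}
```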

You could also run the script on the server and then use curl, or simply use the correct header statements to deliver the file to your browser.

If you want to see my PHP script, let me know. I'll post it here (a work in progress).

Chuck
   
edwurster
10-09-2005, 09:53 PM

Quote:
Originally Posted by ldesign
That single URL, with the correct username and password for your domain, works just as you have it.

(Note: I run Apache, PHP, and MySQL on my home PC.)

In a PHP script on your home computer you can fopen that URL ('rb' - read in binary mode) and then use successive freads to copy the entire compressed gzip file to your hard drive.

Or you can use gzopen and read one line (uncompressed) at a time. I am writing it in this manner so I can go to the previous offset in the file and start from there. That way I only download the newer entries (lines).

You could also run the script on the server and then use curl, or simply use the correct header statements to deliver the file to your browser.

If you want to see my PHP script, let me know. I'll post it here (a work in progress).

I think the script will be very interesting for many reasons. I remember cPanel had a feature where recent visitors were limited to just the IP. Now it shows you every associated hit. One thing I'd like to be able to do is parse info from a log file. It sounds as if you are doing something similar.

This is a great learning experience for me. I've learned a lot from solving small problems with PHP, thanks to Ronnie.


ldesign
10-09-2005, 10:37 PM

It is a work in progress. So, in the meantime, if there are any "pieces" of the puzzle you'd like to see, let me know specifically what you want to do.

Here is the web site that gave me my start.
http://www.devenezia.com/perl/http-log/

I'm using the preg_match from that page to parse my log files.
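
The expression from that page is not reproduced in this thread, so the following is only a sketch of the general idea: one preg_match call that splits a line in Apache "combined" format into named fields. The pattern and the helper name parse_log_line are assumptions, and the pattern does not handle escaped quotes inside the request or user agent.

```php
<?php
// Sketch: parse one line of an Apache "combined" format access log.
// LOG_PATTERN and parse_log_line() are illustrative assumptions, not the
// expression from the linked page.
const LOG_PATTERN = '/^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/';

function parse_log_line(string $line): ?array
{
    if (!preg_match(LOG_PATTERN, rtrim($line, "\r\n"), $m)) {
        return null; // line does not match the expected format
    }
    return [
        'host'      => $m[1],
        'user'      => $m[3],
        'time'      => $m[4],
        'request'   => $m[5],
        'status'    => (int) $m[6],
        'bytes'     => $m[7] === '-' ? 0 : (int) $m[7], // "-" means no body sent
        'referrer'  => $m[8],
        'useragent' => $m[9],
    ];
}
```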

I'm also using my database tables in much the same way: storing IDs for the requests, referrers, and user agents, so that the total database size is smaller than the log file (by not storing redundant request, referrer, and user agent strings).
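
To illustrate the ID idea without a database, here is a small in-memory sketch. The class name IdTable is made up; in the real setup each instance would correspond to a MySQL lookup table, with a SELECT to find an existing string and an INSERT when it is new.

```php
<?php
// Sketch: hand out one integer ID per distinct string, so each hit row can
// store small IDs instead of repeating long strings. IdTable is a
// hypothetical in-memory stand-in for a MySQL lookup table.
final class IdTable
{
    /** @var array<string,int> string => ID */
    private $ids = [];

    public function idFor(string $value): int
    {
        if (!isset($this->ids[$value])) {
            $this->ids[$value] = count($this->ids) + 1; // new string: next ID
        }
        return $this->ids[$value];
    }

    public function size(): int
    {
        return count($this->ids);
    }
}

$agents = new IdTable();
$first  = $agents->idFor('Mozilla/4.0');
$again  = $agents->idFor('Mozilla/4.0'); // same ID, nothing new stored
```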

I also gleaned some information here:
http://www.php-scripts.com/php_diary/012103.php3
(specifically the tip on using an array to convert log file dates to MySQL DATETIME format)
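
The linked tip is not quoted here, so this is just one way the array approach might look: a month-name lookup table turns a log timestamp into MySQL DATETIME form. The helper name log_date_to_mysql is an assumption, and the time zone offset is simply dropped.

```php
<?php
// Sketch: convert an Apache log timestamp such as
// "09/Oct/2005:13:55:36 -0600" to "2005-10-09 13:55:36".
// log_date_to_mysql() is a hypothetical helper; the zone offset is ignored.
function log_date_to_mysql(string $logDate): ?string
{
    static $months = [
        'Jan' => '01', 'Feb' => '02', 'Mar' => '03', 'Apr' => '04',
        'May' => '05', 'Jun' => '06', 'Jul' => '07', 'Aug' => '08',
        'Sep' => '09', 'Oct' => '10', 'Nov' => '11', 'Dec' => '12',
    ];
    $pattern = '/^(\d{2})\/([A-Z][a-z]{2})\/(\d{4}):(\d{2}:\d{2}:\d{2})/';
    if (!preg_match($pattern, $logDate, $m) || !isset($months[$m[2]])) {
        return null; // not a recognizable log timestamp
    }
    // Year, month number from the lookup array, day, then the time as-is.
    return sprintf('%s-%s-%s %s', $m[3], $months[$m[2]], $m[1], $m[4]);
}
```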

Chuck
   
midwest
10-09-2005, 11:19 PM

Quote:
Originally Posted by ldesign
2. Is there any reason to use curl instead of simply using fopen, fread to download the gzip file (or use gzopen, gzgets to read line by line)?

It would make it easier if you were to grab multiple log files from the same script. It also does not send the password within the URL and slightly obfuscates it, making it a bit safer from sniffing, but it is still considered plain text.
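
A sketch of what this might look like with PHP's curl extension, passing the credentials through CURLOPT_USERPWD instead of the URL. The helper names are made up for the example. The first function shows why the credentials still count as plain text: HTTP Basic authentication only base64-encodes them.

```php
<?php
// Sketch: HTTP Basic auth with curl, keeping credentials out of the URL.
// basic_auth_header() and fetch_with_basic_auth() are hypothetical helpers.
function basic_auth_header(string $user, string $pass): string
{
    // Basic auth is base64-encoded, not encrypted: trivially reversible.
    return 'Authorization: Basic ' . base64_encode($user . ':' . $pass);
}

function fetch_with_basic_auth(string $url, string $user, string $pass): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // return body as a string
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($ch, CURLOPT_USERPWD, $user . ':' . $pass);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Download failed: ' . curl_error($ch));
    }
    return $body;
}
```

Grabbing several log files would then be a loop over URLs calling the second function.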


ldesign
10-10-2005, 12:14 AM

Quote:
Originally Posted by midwest
It would make it easier if you were to grab multiple log files from the same script. It also does not send the password within the URL and slightly obfuscates it, making it a bit safer from sniffing, but it is still considered plain text.

I was wondering about the security issue. I don't like sending my username and password as plain text in that URL. In fact, I just started using the secure URL when I log in to cPanel from my browser.

I've been trying to use a secure URL in gzopen, but am not having any luck. i.e.,

https:// followed by
username:password@mydomain.com:2083/getaccesslog/mydomain.com-10-9-2005.gz

I get this warning message:
Warning: gzopen(... the URL above ...): failed to open stream:
Invalid argument in .......\zlib_test.php on line 2

I don't know if there is a way to do that or not, but I've tried everything I can think of.

I wonder if I can use SSL with curl.
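
For what it's worth, curl can fetch https:// URLs when it is built with SSL support, and PHP's own https:// stream wrapper is only available when the openssl extension is loaded, which may be related to the warning above. A minimal sketch under those assumptions follows. The helper name download_https is made up, and certificate verification may fail if the server uses a self-signed certificate.

```php
<?php
// Sketch: download a file over HTTPS with curl and save it to disk.
// download_https() is a hypothetical helper, not a cPanel or PHP API.
function download_https(string $url, string $user, string $pass, string $destPath): void
{
    if (parse_url($url, PHP_URL_SCHEME) !== 'https') {
        throw new InvalidArgumentException('Expected an https:// URL');
    }
    $out = fopen($destPath, 'wb');
    if ($out === false) {
        throw new RuntimeException("Cannot write $destPath");
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_USERPWD, $user . ':' . $pass); // sent inside the encrypted session
    curl_setopt($ch, CURLOPT_FILE, $out);                   // stream the body to the file
    curl_setopt($ch, CURLOPT_FAILONERROR, true);            // treat HTTP errors as failures
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);         // check the server certificate
    $ok = curl_exec($ch);
    $error = curl_error($ch);
    fclose($out);
    if ($ok === false) {
        throw new RuntimeException("Download failed: $error");
    }
}
```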

Chuck
   
ldesign
10-10-2005, 12:34 AM

Here's another question I have. Which would use fewer server resources (and is it even an issue)?

1. I can open (gzopen) the compressed log file right off the server and then process one line at a time, making entries in my database (a little time consuming, as I query for the referrer, request, and user agent in their own database tables to see if they are new or repeats, inserting them into the table if they are new).

2. Or, I can open the compressed log file on the server and copy it line by line onto my hard drive (compressed, to save space), close the connection, and then open it off my hard drive to process line by line.

I don't think it's really much of a resource issue, as I'm doing all the crunching on my PC and only keeping a connection open on the server while I read a line at a time. I don't really know, though, so I'm wondering if it makes any difference how I do it.

I think I'd like to keep using method 1, as I may move this script onto the server someday - in case I ever need to examine log files with my script from some remote location.

Chuck
   