This tutorial explains how to fetch HTTP request logs from Cloudflare using its GraphQL API.
We generally check logs to analyze security threats and performance issues. For example, you can identify the IP addresses of users who are performing web scraping on your website.
You need an API key and a Zone ID to authenticate and use the API. Follow the instructions below to get both.
You can find your API key in the Cloudflare dashboard by clicking the profile icon (top-right) > My Profile > API Tokens. You can use either the global API key or a custom API token with specific permissions. The script in this tutorial sends the global API key together with your account email. The Zone ID is available on the Overview page of your site in Cloudflare.
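The two credential types are sent in different request headers. Here is a minimal sketch of that difference. The helper name build_auth_headers and its parameters are made up for illustration and are not part of any Cloudflare library. It assumes a custom API token is sent as a bearer token, while the global API key is paired with the account email.

```python
def build_auth_headers(api_token=None, api_email=None, api_key=None):
    """Return request headers for a custom API token or for the global API key."""
    headers = {"Content-Type": "application/json"}
    if api_token:
        # A custom API token is sent as a bearer token.
        headers["Authorization"] = f"Bearer {api_token}"
    elif api_email and api_key:
        # The global API key must be paired with the account email.
        headers["X-Auth-Email"] = api_email
        headers["X-Auth-Key"] = api_key
    else:
        raise ValueError("Provide api_token, or both api_email and api_key")
    return headers


# Placeholder values only; substitute your own credentials.
token_headers = build_auth_headers(api_token="xxxxxxxxxxxxx")
global_key_headers = build_auth_headers(api_email="you@example.com", api_key="xxxxxxxxxxxxx")
```

A custom token is usually the safer choice because it can be limited to the permissions this task needs.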
The following Python code fetches up to 3,000 log entries for your website from the past 24 hours and saves them to an Excel file in your current working directory.
import requests
from datetime import datetime, timedelta, timezone
import pandas as pd
from tzlocal import get_localzone

# Query window: the last 24 hours, in UTC
end_hour = datetime.now(timezone.utc)
start_hour = end_hour - timedelta(hours=24)

# Format with explicit UTC designation
def iso_format(dt):
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

# Define the Cloudflare API credentials
API_KEY = "xxxxxxxxxxxxx"
ZONE_ID = "xxxxxxxxxxxxxx"
API_EMAIL = "you@example.com"  # placeholder: the email of your Cloudflare account

# Set up headers (global API key authentication)
headers = {
    "X-Auth-Email": API_EMAIL,
    "X-Auth-Key": API_KEY,
    "Content-Type": "application/json"
}

# Define the GraphQL query with the required fields.
# $filter is declared as a variable because the query body references it.
query = """
query GetAnalytics($zoneTag: String!, $filter: ZoneHttpRequestsAdaptiveFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      httpRequestsAdaptive(
        filter: $filter
        limit: 3000
      ) {
        datetime
        cacheStatus
        clientASNDescription
        clientAsn
        clientCountryName
        clientDeviceType
        clientIP
        clientRequestHTTPHost
        clientRequestHTTPMethodName
        clientRequestHTTPProtocol
        clientRequestPath
        clientRequestQuery
        clientRequestScheme
        clientSSLProtocol
        edgeResponseStatus
        leakedCredentialCheckResult
        originResponseDurationMs
        requestSource
        securityAction
        securitySource
        userAgent
      }
    }
  }
}
"""

# Create the request body with the variables
payload = {
    "query": query,
    "variables": {
        "zoneTag": ZONE_ID,
        "filter": {
            "datetime_geq": iso_format(start_hour),
            "datetime_leq": iso_format(end_hour)
        }
    }
}

# Make the API request
response = requests.post(
    url="https://api.cloudflare.com/client/v4/graphql",
    headers=headers,
    json=payload
)

# Process the response
if response.status_code >= 400:
    raise Exception(f"Error: {response.status_code}, {response.text}")

data = response.json()

# A GraphQL API can report errors in the body even when the HTTP status is 200
if data.get("errors"):
    raise Exception(f"GraphQL errors: {data['errors']}")

http_requests = data['data']['viewer']['zones'][0]['httpRequestsAdaptive']

# Convert the data into a pandas DataFrame
df = pd.json_normalize(http_requests)

# Parse timestamps as UTC, convert to the local timezone, then drop the
# timezone info because Excel cannot store timezone-aware datetimes
local_timezone = get_localzone()
df['datetime'] = pd.to_datetime(df['datetime'], utc=True)
df['datetime'] = df['datetime'].dt.tz_convert(local_timezone).dt.tz_localize(None)
df = df.sort_values(by='datetime', ascending=False)

# Write the DataFrame to an Excel file (pandas needs openpyxl installed for .xlsx)
df.to_excel('cloudflare_logs.xlsx', index=False)
The script writes the 21 fields requested in the query, one column per field, to an Excel file named "cloudflare_logs.xlsx".
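To illustrate the web scraping use case mentioned at the start, here is a minimal sketch that counts requests per client IP. It runs on a small, made-up sample DataFrame that reuses the column names from the query (clientIP, clientRequestPath, userAgent). The function name top_client_ips is hypothetical, and a high request count alone only hints at scraping rather than proving it.

```python
import pandas as pd

# Made-up sample rows standing in for the real log DataFrame.
sample_logs = pd.DataFrame({
    "clientIP": ["203.0.113.5", "203.0.113.5", "198.51.100.7", "203.0.113.5", "192.0.2.9"],
    "clientRequestPath": ["/a", "/b", "/a", "/c", "/a"],
    "userAgent": ["bot/1.0", "bot/1.0", "Mozilla/5.0", "bot/1.0", "Mozilla/5.0"],
})


def top_client_ips(logs, n=10):
    """Return the n client IPs with the most requests and their distinct path counts."""
    return (
        logs.groupby("clientIP")
        .agg(
            requests=("clientRequestPath", "size"),
            distinct_paths=("clientRequestPath", "nunique"),
        )
        .sort_values("requests", ascending=False)
        .head(n)
        .reset_index()
    )


top = top_client_ips(sample_logs, n=3)
print(top)
```

With the real data, you would pass the DataFrame built from the API response instead of sample_logs. An IP with many requests spread over many distinct paths is a reasonable starting point for a closer look at its user agent and ASN.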