Skip to content

Route Optimization Engineering Spec

2.2 ROUTE GENERATION: DEEP TECHNICAL SPECIFICATION

2.2.1 BASE ALGORITHM: OVERVIEW

The route generation system will leverage classical optimization algorithms, hybridized with heuristics for performance, using OSRM (Open Source Routing Machine) as the backbone for travel distance and time calculations. The clustering and routing algorithms will focus on minimizing detours from a straight-line trajectory while satisfying user-defined constraints.


2.2.2 INPUT DATA PROCESSING

  1. Data Parsing and Validation:
  2. Ensure the CSV file contains valid latitude/longitude pairs.
  3. Compute a geospatial bounding box for the destinations for quick validation of data integrity.
  4. Validate that all inputs are within acceptable coordinate ranges (e.g., latitude: -90° to 90°, longitude: -180° to 180°).

  5. Distance Calculation:

  6. Compute the straight-line (Haversine) distance between all destination points and the pickup location to determine initial proximity clusters.
  7. Use OSRM to calculate real travel distance and time between points to factor road network constraints (e.g., highways, traffic regulations).

2.2.3 CLUSTERING DESTINATIONS

The purpose of clustering is to group destinations into logical route segments such that:
1. Destinations in a single route are spatially cohesive.
2. The number of destinations in a route does not exceed the user-defined N.

Clustering Methodology:
  1. K-Means++ with Distance Constraints:
  2. Initialize centroids using the farthest-point initialization method (e.g., K-Means++).
  3. Use Haversine distance for clustering to ensure geographic cohesiveness.
  4. Constraint: Limit cluster sizes to N.

  5. Hierarchical Agglomerative Clustering (HAC):

  6. Start with each destination as its own cluster.
  7. Iteratively merge clusters based on the closest neighbor (Haversine distance).
  8. Stop merging when adding a new destination violates the N constraint.

  9. Custom Clustering Heuristics (Fallback):

  10. For edge cases where user-defined constraints cannot be satisfied (e.g., too few destinations), dynamically adjust N to form feasible clusters.

2.2.4 ROUTE OPTIMIZATION WITHIN CLUSTERS

Once clusters are formed, intra-cluster route optimization is performed to determine the best sequence of destinations:

  1. Travel Cost Calculation (OSRM):
  2. Query OSRM for real travel distance and time for each pair of destinations within the cluster.
  3. Construct a directed weighted graph where nodes represent destinations, and edges represent the travel cost between them.

  4. Route Optimization Algorithm:

  5. Travelling Salesman Problem (TSP) Approximation:
    • Use Christofides' Algorithm for near-optimal solutions in polynomial time.
    • Evaluate performance trade-offs of time complexity (O(n³) vs. brute force O(n!)).
  6. Insertion Heuristics:

    • Start with a pair of closest nodes, and iteratively insert the next closest unvisited node.
    • Greedy approach for balancing runtime and route quality.
  7. Path Post-Processing:

  8. Check for overlaps or inefficiencies (e.g., revisiting nearby points unnecessarily).
  9. Recompute route segments where deviation exceeds user-defined thresholds.

2.2.5 STRAIGHT-LINE DEVIATION MINIMIZATION

Objective: Minimize detours from the straight trajectory to the final drop-off point.
  1. Deviation Metric Calculation:
  2. Compute the angle θ between the vector representing the straight-line path to the drop-off point and the vector representing the optimized route segment.
  3. Deviation is quantified as:
    [ \text{Deviation Score} = \sum_{i=1}^{n} \lvert \text{cos}(\theta_i) \rvert \cdot d_i ]
    where \( d_i \) is the segment distance, and \( \theta_i \) is the angle of deviation.

  4. Deviation Threshold Enforcement:

  5. Reject routes where the deviation exceeds the user-defined maximum threshold (either as a percentage of the segment length or absolute meters).
  6. Dynamically reroute or recluster destinations to reduce deviation.

2.2.6 CONSTRAINT MANAGEMENT

1. Destination Count Per Route:
  • Enforce N destination limit by constraining clustering or breaking oversized clusters into smaller sub-clusters.
2. Maximum Deviation Threshold:
  • Allow user-configurable thresholds in the UI for:
  • Absolute deviation in meters (e.g., 500m).
  • Relative deviation as a percentage of the straight-line distance (e.g., 10%).
3. Time-Based Constraints (Future Expansion):
  • Introduce travel time limits per route based on average traffic data.

2.2.7 PERFORMANCE OPTIMIZATION

  1. Caching:
  2. Cache OSRM results for previously computed origin-destination pairs.
  3. Use Redis or similar in-memory databases to accelerate repeated queries.

  4. Parallelization:

  5. Distribute OSRM queries and route optimizations across multiple threads or worker processes using Python’s multiprocessing or Ray.

  6. Cluster-to-Route Balancing:

  7. Rebalance clusters dynamically if route optimization results in high deviation scores.

2.2.8 ERROR HANDLING AND EDGE CASES

  1. Invalid Input:
  2. Handle malformed CSV files gracefully with descriptive error messages.
  3. Sparse Data:
  4. Ensure routing is still feasible if destinations are widely dispersed.
  5. Unsatisfiable Constraints:
  6. If user constraints (e.g., max destinations per route) are infeasible, provide alternative suggestions (e.g., increment N).

2.2.9 TECHNICAL IMPLEMENTATION STACK

  • Backend Logic: Python (NumPy, Pandas, Scipy).
  • Geospatial Calculations: Geopy and Haversine for preliminary distances.
  • Routing Engine: OSRM via HTTP API.
  • Graph Construction: NetworkX or custom graph libraries.
  • Optimization Frameworks: OR-Tools (Google) for TSP approximation.

This deeper dive provides a structured methodology for building the route generation component, balancing mathematical rigor, user constraints, and computational efficiency. Let me know if specific areas need more elaboration!